Portable vkernel (emulator)

Matthew Dillon dillon at apollo.backplane.com
Fri Jul 11 10:06:28 PDT 2008


:So it's a good thing there are companies like Red Hat paying people
:full time to maintain those implementations. Support contracts get the
:users of the software paying for its maintenance, and it works very
:well for Linux.
:
:ZFS has Sun employees working around the clock too. And now they have
:the FreeBSD community helping out as well, and serving as the first
:real-world port. If licensing can be worked out and Linux gets ZFS,
:it's easy to see it gaining a huge user base and no shortage of
:testers.

    I don't think I would describe it quite like that.  While there might
    be commercial support, the issue is primarily that only a few people
    will understand the filesystem code well enough to actually work on it,
    and it doesn't really matter whether it is being pushed by a large 
    company or being pushed by an individual.  In fact, if you look at the
    commercial offerings today, the rate of development of their products
    over the years is nowhere near what it would have been had it been
    possible to just throw more programmers at the problem.

    Many companies wind up being stuck with technologies they no longer
    have the expertise to develop... probably the best example of this
    is in the CAD/CAM arena, and the FPGA arena.  Or flight scheduling in
    the airline industry.  All they are able to do is add pretty UIs on
    top of the original code, and enhance bits around the edges.  This
    constricts the rate at which the filesystem can be developed and
    creates a development curve over the years. 

    Assigning more programmers to the task won't really help matters all
    that much.  The best a commercial company can do is assign people to
    take all the annoying bits off the primary programmer's table so
    they can focus more on the parts that nobody else understands (i.e.
    the filesystem core).  If that doesn't do the job then they use a nice
    little euphemism called end-of-life to phase out the project.

    Not likely to happen to a filesystem like ZFS any time soon, but the
    hype that is applied to a commercial company's ability to support a
    product goes far beyond their actual ability to support that product.
    What would be a good example of that... hmmm..  Probably SGI's XFS
    would be a good example.  It is a filesystem that is in wide use on
    Linux, but where primary development has pretty much been frozen
    insofar as I can tell.  All the work done on it in the last 3 years
    has been around the edges.

:>    UFS went essentially undeveloped for 20 years because the original
:>    developers stopped working on it.  All the work done on it since FFS
:>    and BSD4-light (circa 1983-1991-ish) has mostly been in the form of
:>    hacks.  Even the UFS2 work in FreeBSD is really just a minor extension
:>    to UFS, making fields 64 bits instead of 32 bits and a few other
:>    little things, and nothing else.
:
:I agree they're hacks, but it's certainly worth appreciating that
:things like soft updates and background fsck (the most terrifying hack
:of all) have kept FreeBSD relevant even without a proper journalling
:file system.

    Softupdates was a good stopgap, though I will note that people started
    using softupdates while it was still quite buggy, and in fact softupdates
    had serious bugs for years after it first started getting used.  That
    did not stop its wide deployment, though.  But softupdates is a
    ridiculously complex beast.  Exactly three people (including Kirk) in 
    the entire world understand it well enough to work on it now. 

    Most of the expansion projects for UFS, such as the snapshot support
    and the journaling layer, appear dead in the water.  Both Ext and Reiser
    also had very long periods of idleness, though Ext recovered.  ReiserFS
    probably won't, for obvious reasons, even though Reiser4 looks pretty
    awesome to me.  I'm kinda wondering what the Linux community will do.
    Is any work being done on IBM's JFS any more?  I haven't heard
    anything recently.

    The actual number of filesystems under active and continuing development
    is like 25% of the total available.

:>    Ext, Reiser, Coda... I think every single filesystem you see on Linux
:>    *EXCEPT* the four commercial ones (Veritas, IBM's stuff, Sun's ZFS,
:>    and SGI's XFS) are almost single-person projects.  So I'm not too worried
:>    about my ability to develop HAMMER :-)
:
:Right, but who's going to test it? ext3 is being tested on almost
:every Linux desktop today, and many servers. While it doesn't have
:many "developers", looking at patches being applied, lots of people
:make contributions if problems do come up. I agree that pound for
:pound, DragonFly's user base has a much higher proportion of
:developers and eager testers, but they might not necessarily test the
:implementations ported to other systems, which are the ones that will
:get tens of thousands of users instead of hundreds.

    A thousand eyeballs aren't going to help the programmers any better than
    a hundred eyeballs.  No major filesystem ever developed was developed
    with a thousand people testing it.  It's only after the filesystem is
    pounded into reasonably good shape by the original developers that
    real use cases begin.  This might be a bit less true of ZFS, but it is
    certainly the case for OpenSource-originated filesystems (and, generally,
    the case for OpenSource as a whole).

    It still comes down to one or two programmers and they still have only
    so many hours a day in which they can work on the project.  Frankly,
    it's somewhat of a race.  If the code isn't good enough for fairly wide
    deployment the feedback becomes a repetitive deluge that is more of
    a detriment to the people doing the work than it is a help.

:>    Yah, I read about the linux work.  That was mainly IBM I think,
:>    though to be truthful it was primarily solved simply by IBM reducing
:>    the clock interrupt rate, which I think pushed the linux folks to
:>    move to a completely dynamic timing system.
:
:I think the dynamic timing system was more motivated by power savings
:than virtualisation, but it's a nice bonus.

    Taking a thousand interrupts a second eats no significant power.  The
    cpu is still 99.99% idle.  However, entering and leaving power savings
    modes was considerably more costly two years ago than it will be two
    years from now.  Both Intel and AMD are moving towards core power savings
    implementations based around HLT (or something similar), which takes all
    the effort out of switching between power savings modes and takes all
    the effort out of switching cpu frequencies.
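
    As a rough check on the "99.99% idle" figure above, the arithmetic can
    be sketched as follows.  The function names are hypothetical, and the
    1 microsecond per-interrupt cost in the usage lines is an assumed value
    chosen only for illustration, not a measurement from any real system.

```python
def busy_fraction(interrupts_per_sec: float, cost_per_interrupt_sec: float) -> float:
    """Fraction of each second the cpu spends servicing timer interrupts."""
    return interrupts_per_sec * cost_per_interrupt_sec


def implied_cost_per_interrupt(interrupts_per_sec: float, idle_fraction: float) -> float:
    """Per-interrupt cost (seconds) implied by a given idle fraction."""
    return (1.0 - idle_fraction) / interrupts_per_sec


if __name__ == "__main__":
    # Figures quoted in the text: 1000 interrupts/sec, cpu 99.99% idle.
    cost = implied_cost_per_interrupt(1000, 0.9999)
    print(f"implied cost per interrupt: {cost * 1e9:.0f} ns")

    # Assumed (hypothetical) cost of 1 microsecond per interrupt.
    busy = busy_fraction(1000, 1e-6)
    print(f"busy fraction at 1 us/interrupt: {busy:.4%}")
```

    Under these assumptions the quoted figure corresponds to roughly a
    hundred nanoseconds of work per interrupt, and even a cost ten times
    larger still leaves the cpu about 99.9% idle.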
    
    We'd have to dig up the threads to see what the original thinking was.
    I think it's a good idea anyway, just from a design standpoint, but
    I think it's more a matter of supporting different subsystems needing
    different frequencies than it is a matter of trying to save power.

:>    Similarly for something like FreeBSD or DragonFly, reducing the clock
:>    rate makes a big difference.   DragonFly's VKERNEL drops it down to
:>    20Hz.
:
:Would DragonFly be able to implement dynamic ticks as well? Perhaps
:it's not a huge priority but it's something people expect for modern
:power-efficient systems. It may not matter quite so much when the CPU
:is a tiny part of the total system power, but with 8 CPUs it adds up
:pretty quickly.

    We already do, but several subsystems still schedule periodic timers
    using the facility (albeit at different frequencies).  So e.g. the
    thread scheduler uses 20Hz and 100Hz, the callout scheduler uses
    100Hz, network polling (if enabled) uses 1000Hz, and so forth.  Our
    timers are not tied together like they are in FreeBSD, so we can
    control the various subsystems independently, but we are still taking
    300-1000 interrupts per second on each cpu.
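
    The idea of independent periodic timers sharing one "next deadline"
    facility can be sketched as below.  This is a simplified model, not
    DragonFly's actual timer code: the timer names are hypothetical, all
    timers are assumed to start in phase, and the frequencies are simply
    the ones mentioned above.  It does not attempt to reproduce the
    300-1000 interrupts per second figure.

```python
import heapq


def simulate(freqs_hz, duration_us=1_000_000):
    """Run periodic timers for duration_us microseconds.

    Returns (total_firings, distinct_wakeups).  A wakeup is one instant at
    which the cpu must take an interrupt; several timers that expire at the
    same instant share a single wakeup.
    """
    # Heap of (next_deadline_us, period_us, name), earliest deadline first.
    # Periods are truncated to whole microseconds (an assumption of this model).
    heap = [(1_000_000 // hz, 1_000_000 // hz, name) for name, hz in freqs_hz.items()]
    heapq.heapify(heap)

    firings = 0
    wakeups = 0
    last_wakeup = None
    while heap and heap[0][0] <= duration_us:
        deadline, period, name = heapq.heappop(heap)
        firings += 1
        if deadline != last_wakeup:
            # A new instant: the hardware timer would be programmed for it.
            wakeups += 1
            last_wakeup = deadline
        # Re-arm the periodic timer for its next expiry.
        heapq.heappush(heap, (deadline + period, period, name))
    return firings, wakeups


if __name__ == "__main__":
    timers = {"sched_slow": 20, "sched_fast": 100, "callout": 100, "polling": 1000}
    print(simulate(timers))
    # Same set with network polling disabled.
    del timers["polling"]
    print(simulate(timers))
```

    In this model the fastest periodic timer sets the floor on the wakeup
    rate, which is why disabling a high-frequency consumer such as polling
    matters more than the number of slower timers.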

    Again, though, on real hardware this doesn't really use all that much
    power.

:
:>    No open source OS today is natively clusterable.  Not one.  Well, don't
:>    quote me on that :-).  I don't think OpenSolaris is, but I don't know
:>    much about it.   Linux sure as hell isn't, it takes reams of hacks to
:>    get any sort of clustering working on linux and it isn't native to the
:>    OS.  None of the BSDs.  Not DragonFly, not yet.  Only some of the
:>    big commercial mainframe OSs have it.
:
:That's probably because nobody really cares. Clustering is almost
:universally done on the application level, where it can be optimized
:much better to the specific work being done. Mass deployment is a
:solved problem. Machines are getting bigger and more scalable and more
:parallel and cheaper, significantly weakening the argument for
:multiple slow machines clustered together.

    I would disagree.  It's not that nobody cares, it's that nobody has a
    choice BECAUSE there is no native support.  Working with the clustering
    libraries is nasty as hell.  It is NOT fun and is probably the reason
    why cluster-based programming has been progressing so slowly.

    Native clustering does not prevent one from adding user-layer
    optimizations, what it does is make the basic programming task
    more abstract and a lot easier to work through.

:>    Yah, and it's doable up to a point.  It works greats for racks of
:>    servers, but emulating everything needed for a workstation environment
:>    is a real mess.  VMWare might as well be its own OS, and in that respect
:>    the hypervisor support that Linux is developing is probably a better
:>    way to advance the field.  VMWare uses it too but VMWare is really a
:>    paper tiger... it wants to be an OS and a virtualization environment,
:>    so what happens to it when Linux itself becomes a virtualization
:>    environment?
:
:The Linux community wants Linux to be everything. And to some degree,
:I agree. Just having at least one open source solution for every
:problem is a great economic freedom. The network effect of Linux as a
:whole is what makes it so much more powerful than any individual
:product it eventually replaces. VMware will probably end up rebasing
:on Linux to some degree.
:
:http://en.wikipedia.org/wiki/VMware_ESX_Server#Architecture
:
:Oh.
:
:-- 
:Dmitri Nikulin

    I'd agree with that.  It's a kitchen sink approach created by many
    individual needs, glued together (with varying levels of success)
    by the distributions.

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>





