ZFS (fwd)

Matthew Dillon dillon at apollo.backplane.com
Thu Jun 22 09:46:56 PDT 2006


:As far as I can tell, it probably won't happen until the userland VFS
:work, which won't happen until the MP work, which was going along
:great but with no commits for a while now (not necessarily a bad sign,
:just means we can't tell what's happening). Meanwhile, you have
:OpenSolaris, which should be comparable in stability and performance
:to DragonFly, and can utilize pkgsrc :)
:
:  -- Dmitri Nikulin

    Yah, it's a chicken-and-egg problem.  The key issue with the MP work
    is to get it down into the VFS interface (it isn't at the moment).
    Then userland VFS can proceed.

    At the moment I'm stuck trying to figure out the best way to make the
    namecache MP safe.  I'm trying to avoid FreeBSD's master mutex approach.
    Nor do I want to have a per-namecache-record spinlock since this would
    drastically reduce performance on long paths.

    I think the best solution might be to have a 'zone' spinlock for the
    namecache.  The idea is that the namecache records would have a pointer
    to a spinlock structure which might be shared across multiple records.
    Initially the 'zone' would be the entire namecache.  When a conflict
    occurs the namecache would determine if it would be beneficial to split
    the namecache's spinlock.
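
    A minimal sketch of the shared-pointer idea, in C11 with userland
    atomics.  Every name here (nc_zone, nc_record, nc_global_zone, and
    the helper functions) is hypothetical and is not taken from the
    DragonFly sources; the conflict counter is an assumed way of noticing
    that a split might pay off.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not DragonFly's namecache. */
struct nc_zone {
    atomic_flag busy;        /* the spinlock itself */
    atomic_uint conflicts;   /* how often a locker found it already held */
};

struct nc_record {
    const char *name;
    struct nc_record *parent;
    struct nc_zone *zone;    /* may be shared by many records */
};

/* Initially a single zone covers the entire namecache. */
static struct nc_zone nc_global_zone = { ATOMIC_FLAG_INIT, 0 };

static void nc_zone_lock(struct nc_zone *z)
{
    if (atomic_flag_test_and_set_explicit(&z->busy, memory_order_acquire)) {
        /* Contended: remember it so a later pass can consider a split. */
        atomic_fetch_add(&z->conflicts, 1);
        while (atomic_flag_test_and_set_explicit(&z->busy,
                                                 memory_order_acquire))
            ;   /* spin */
    }
}

static void nc_zone_unlock(struct nc_zone *z)
{
    atomic_flag_clear_explicit(&z->busy, memory_order_release);
}

/* A new record inherits its parent's zone, or the global one at the root. */
static void nc_record_init(struct nc_record *r, const char *name,
                           struct nc_record *parent)
{
    assert(r != NULL);
    r->name = name;
    r->parent = parent;
    r->zone = parent ? parent->zone : &nc_global_zone;
}
```

    Walking a long path under this layout takes one lock per zone
    crossed rather than one per component, which is the point of
    sharing the pointer.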

    For example, let's say we were running a mail system with multiple
    mail queues: queue, queue/a, queue/b, and queue/c.  The entire hierarchy
    would start out sharing a spinlock.  As the load causes spinlock conflicts
    the namecache would break out the mail queue directories, giving them
    their own spinlock.
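
    The break-out in the mail queue example could look roughly like the
    following.  This is only a structural sketch: the spinlock itself is
    left out, the child/sibling links are an assumed tree layout, and
    NC_SPLIT_THRESHOLD is a made-up tuning knob standing in for whatever
    test decides a split is beneficial.

```c
#include <assert.h>
#include <stddef.h>

#define NC_SPLIT_THRESHOLD 1000   /* hypothetical tuning knob */

struct nc_zone {
    unsigned conflicts;           /* spinlock itself omitted in this sketch */
};

struct nc_record {
    const char *name;
    struct nc_zone *zone;
    struct nc_record *child;      /* first child, or NULL */
    struct nc_record *sibling;    /* next sibling, or NULL */
};

/* Move dir, and every descendant still sharing the old zone, onto fresh. */
static void nc_zone_split(struct nc_record *dir, struct nc_zone *old,
                          struct nc_zone *fresh)
{
    if (dir->zone != old)
        return;                   /* already broken out earlier; leave it */
    dir->zone = fresh;
    for (struct nc_record *c = dir->child; c != NULL; c = c->sibling)
        nc_zone_split(c, old, fresh);
}

/* Split only once the shared zone has seen enough conflicts.
 * Returns 1 if dir was broken out, 0 otherwise. */
static int nc_zone_maybe_split(struct nc_record *dir, struct nc_zone *fresh)
{
    struct nc_zone *old = dir->zone;

    if (old->conflicts < NC_SPLIT_THRESHOLD)
        return 0;
    nc_zone_split(dir, old, fresh);
    old->conflicts = 0;           /* start counting afresh after the split */
    return 1;
}
```

    Calling nc_zone_maybe_split() on queue/a moves only queue/a and
    whatever lives beneath it; queue, queue/b and queue/c keep sharing
    the original zone until they conflict enough to be split themselves.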

    One approach to this would be to have a fixed pool of spinlocks, say
    128.  Theoretically one does not need any more than the maximum amount
    of parallelism one desires in the namecache.  Spinlocks from the pool
    would be assigned to namecache records and the system would choose a
    'random' spinlock from the pool when it decides it needs to reassign
    one.
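
    A fixed pool with a 'random' pick might be sketched as below.  The
    names (nc_pool, nc_pool_init, nc_pool_pick) are hypothetical, and the
    scrambled counter is just one cheap way to scatter picks across the
    pool; any reasonably uniform choice would serve.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NC_POOL_SIZE 128          /* the figure suggested in the text */

struct nc_spinlock {
    atomic_flag busy;
};

/* Static storage: these locks are never allocated or freed at runtime. */
static struct nc_spinlock nc_pool[NC_POOL_SIZE];
static atomic_uint nc_pick_seq;   /* bumped on every pick */

static void nc_pool_init(void)
{
    for (int i = 0; i < NC_POOL_SIZE; i++)
        atomic_flag_clear(&nc_pool[i].busy);
}

/* Pick a pseudo-random pool entry.  The counter is atomic, so concurrent
 * callers never race on it; the multiply scrambles consecutive values. */
static struct nc_spinlock *nc_pool_pick(void)
{
    uint32_t h = (uint32_t)atomic_fetch_add(&nc_pick_seq, 1) * 2654435761u;
    struct nc_spinlock *s = &nc_pool[(h >> 16) % NC_POOL_SIZE];

    assert(s >= nc_pool && s < nc_pool + NC_POOL_SIZE);
    return s;
}
```

    Two unrelated records can land on the same pool entry.  That costs
    some false sharing but never correctness, and it is why the pool only
    needs to be as large as the parallelism one wants.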

    Because reassignment is a dynamic operation, a fixed pool gives us
    the ability to maintain consistency during the reassignment by
    guaranteeing that the spinlock itself would not be destroyed by
    creation/deletion operations in the namecache.
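
    One way that guarantee could be used is a lock-then-recheck loop,
    sketched below.  This protocol is an assumption on my part and not
    something spelled out above: a locker acquires whatever spinlock the
    record points at, then verifies the pointer did not change while it
    was spinning.  Acquiring a stale lock is harmless precisely because
    pool memory is never freed.  All names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

struct nc_spinlock {
    atomic_flag busy;
};

struct nc_record {
    _Atomic(struct nc_spinlock *) lock;   /* always points into the pool */
};

static void spin_acquire(struct nc_spinlock *s)
{
    while (atomic_flag_test_and_set_explicit(&s->busy, memory_order_acquire))
        ;   /* spin */
}

static void spin_release(struct nc_spinlock *s)
{
    atomic_flag_clear_explicit(&s->busy, memory_order_release);
}

/* Lock a record, tolerating a concurrent reassignment of its spinlock.
 * Returns the spinlock that is now held on the record's behalf. */
static struct nc_spinlock *nc_record_lock(struct nc_record *r)
{
    for (;;) {
        struct nc_spinlock *s = atomic_load(&r->lock);

        spin_acquire(s);              /* safe: s is pool memory, never freed */
        if (s == atomic_load(&r->lock))
            return s;                 /* still the record's lock */
        spin_release(s);              /* reassigned underneath us; retry */
    }
}

/* Point the record at a different pool lock.  The caller must hold the
 * record's current lock; it is released once the new pointer is visible. */
static void nc_record_reassign(struct nc_record *r, struct nc_spinlock *held,
                               struct nc_spinlock *next)
{
    assert(atomic_load(&r->lock) == held);
    atomic_store(&r->lock, next);
    spin_release(held);
}
```

    Anyone who was spinning on the old lock gets it, sees the pointer has
    moved, drops it, and retries against the new one.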

    In any case, I'm starting to shift into release mode.  I will pick up
    the namecache work after we release in mid-July.  I've made a considerable
    amount of progress with the MP work just locking up the file descriptors
    and file pointers, so it's good to take a break and make sure that all
    that work is solid before doing another push.

					-Matt
					Matthew Dillon 
					<dillon at xxxxxxxxxxxxx>




