modifying nullfs

Matthew Dillon dillon at apollo.backplane.com
Fri Sep 7 18:44:26 PDT 2012


:honestly, i think about some kind of abstraction layer over HammerFS, 
:that's why a stackable FS impressed me.

    Stackable FS's are always interesting, but they are also full of
    problems.

    The NFS server implementation is a good example.  When you export
    a filesystem via NFS the NFS client has to talk to the NFS server
    and that's essentially creating a stacking layer on top of the
    original filesystem being exported by the server.

    There are three primary problems with any stacking filesystem:

    * Coherency if someone goes and does something to a file or directory
      (like remove or rename it) via the underlying filesystem.  The
      stacked filesystem doesn't know about it.

    * Tracking the vnode associations is particularly difficult because
      you can't just keep the pairs of vnodes (the overlaid vnode and
      the underlying vnode) referenced all the time.  There are too many.
      In particular, even just leaving the underlying vnodes referenced
      creates a real problem for the kernel's vnode cache management code
      because it can only hold a limited number of vnodes.

      (The NFS server handles this by not keeping a permanent ref on the
      vnodes requested by clients.  Instead it can force clients to
      re-lookup the filename and re-establish any vnode association it
      had removed from the cache.  It works for most cases but does not
      work well for the open-descriptor-after-unlinking case and can cause
      serious confusion when multiple NFS clients rename the same file or
      directory).

    * And overhead.  When you have a stacked filesystem (such as an NFS
      server), versus a filesystem alias (such as NULLFS), the stacked
      filesystem has considerable kernel memory overhead to track the
      stacking, which creates a memory management issue if you try to
      stack very large filesystems.
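
    To make the first two problems concrete, here is a rough user-space
    sketch in Python.  It is not kernel code and not how any real
    filesystem is written.  LowerFS, StackedFS, the integer file ids and
    the 'limit' parameter are all hypothetical names made up for the
    illustration.  The stacking layer keeps only a bounded number of
    name-to-file associations and re-looks-up by name on a miss, roughly
    in the spirit of what the NFS server does.

```python
from collections import OrderedDict


class LowerFS:
    """Toy underlying filesystem: maps names to file ids (vnode stand-ins)."""

    def __init__(self):
        self.names = {}      # name -> file id
        self.next_id = 1

    def create(self, name):
        self.names[name] = self.next_id
        self.next_id += 1
        return self.names[name]

    def rename(self, old, new):
        self.names[new] = self.names.pop(old)

    def lookup(self, name):
        return self.names.get(name)   # None if the name no longer exists


class StackedFS:
    """Toy stacking layer holding at most `limit` associations."""

    def __init__(self, lower, limit):
        self.lower = lower
        self.limit = limit
        self.assoc = OrderedDict()    # name -> lower file id, in LRU order

    def lookup(self, name):
        if name in self.assoc:
            self.assoc.move_to_end(name)        # hit: answer may be stale
            return self.assoc[name]
        fid = self.lower.lookup(name)           # miss: re-lookup by name
        if fid is not None:
            self.assoc[name] = fid
            if len(self.assoc) > self.limit:
                self.assoc.popitem(last=False)  # drop least recently used
        return fid


lower = LowerFS()
upper = StackedFS(lower, limit=2)
a = lower.create("a")
lower.create("b")
lower.create("c")

upper.lookup("a")
lower.rename("a", "z")           # change made behind the stacking layer
stale = upper.lookup("a")        # still answered from the old association
upper.lookup("b")
upper.lookup("c")                # evicts "a" because the limit is 2
after_evict = upper.lookup("a")  # forced re-lookup: the name is gone
```

    While the association is cached the layer keeps answering for a name
    that no longer exists underneath it (the coherency problem).  Once
    the association is evicted, the re-lookup by name fails even though
    the file itself is still there under a new name, which is the kind
    of confusion renames cause.  Raising the limit hides the second
    effect, but only by pinning more state, which is the overhead
    problem.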

    Another example of a stacked filesystem would be the UFS union mount
    (unionfs) in FreeBSD.  It was removed from DragonFly and has had
    endemic problems in FreeBSD for, oh, ever since it was written.  It
    depended heavily on the 'VOP_WHITEOUT' feature which is something only
    UFS really supports, and not very well at that because directory-entry
    whiteouts can't really be backed up.  The union filesystem tried to
    stack two real filesystems and present essentially a 'writable snapshot'
    as the mount.
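
    For anyone who hasn't run into whiteouts, the idea can be sketched
    in a few lines of Python.  This is only a toy model built on
    dictionaries, not the unionfs or VOP_WHITEOUT implementation.
    UnionView and the WHITEOUT marker are hypothetical names invented
    for the example.  Removing a name that exists in the read-only lower
    layer can't touch that layer, so the upper layer records a marker
    that hides it.

```python
WHITEOUT = object()   # marker meaning "this name is deleted in the union"


class UnionView:
    """Toy 'writable snapshot': a writable layer over a read-only one."""

    def __init__(self, lower):
        self.lower = lower   # read-only base layer: name -> contents
        self.upper = {}      # writable layer: name -> contents or WHITEOUT

    def read(self, name):
        if name in self.upper:
            entry = self.upper[name]
            return None if entry is WHITEOUT else entry
        return self.lower.get(name)

    def write(self, name, data):
        self.upper[name] = data           # shadows any lower entry

    def remove(self, name):
        if name in self.lower:
            self.upper[name] = WHITEOUT   # hide the lower entry
        else:
            self.upper.pop(name, None)    # upper-only file: just drop it

    def listdir(self):
        names = set(self.lower) | set(self.upper)
        return sorted(n for n in names
                      if self.upper.get(n) is not WHITEOUT)


base = {"passwd": "v1", "motd": "hello"}
u = UnionView(base)
u.write("passwd", "v2")   # modified copy lives in the upper layer
u.remove("motd")          # leaves a whiteout, base is untouched
u.write("new", "x")
u.remove("new")           # never existed below, so no whiteout needed
```

    Note that the whiteout is an entry with no file contents behind it.
    In this toy model, anything that copies only the real files out of
    the upper layer loses the marker, and "motd" reappears after a
    restore.  That is one way to picture why whiteouts are awkward to
    back up.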

    So it's a very interesting area but complex and difficult to implement
    properly under any circumstances.

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>




