cvs commit: src/sys/kern vfs_cache.c vfs_syscalls.c vfs_vnops.c vfs_vopops.c src/sys/sys namecache.h stat.h

Matthew Dillon dillon at apollo.backplane.com
Thu Aug 25 13:18:23 PDT 2005


:
:On Thu, Aug 25, 2005 at 11:34:17AM -0700, Matthew Dillon wrote:
:>   Implement FSMID.  Use one of the spare 64 bit fields in the stat structure
:>   for the FSMID.   The FSMID is a recursively updated field which allows one
:>   to determine whether a subdirectory hierarchy has changed simply by checking
:>   the base directory of the desired hierarchy.  The new field is st_fsmid.
:
:Please don't do it. This kind of functionality can be synthesised by
:imon under Linux or kqueue on the BSDs. It is therefore redundant.
:The approach doesn't solve most of the problems and just provides the
:means necessary to detect something changed, it still needs to recurse
:into the directory hierarchy. It's IMO also not reliable since a vnode
:change does not necessarily reach all parent directories with entries
:for this vnode, simply because they might never have been read. It can
:also add a considerable overhead for deeply nested filesystems, which
:shouldn't be done lightly.

    I don't know about imon under Linux, but kqueue on the BSDs doesn't
    even come *close* to providing the functionality needed, let alone
    providing us with a way to monitor changes across distinct invocations.
    Using kqueue for that sort of thing is a terrible idea.

    I'm not sure I understand what you mean about not reaching all
    parent directories.  Perhaps you did not read the patch set.  It
    most certainly DOES reach all parent directories, whether they've been
    read or not.  That's the whole point.  It goes all the way to '/'.
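
    A minimal sketch of that recursive update, written as a user-space
    Python model rather than the actual namecache code.  The Node class,
    the global counter and mark_modified() are hypothetical names used
    only for illustration; the assumption is that each modification stamps
    one fresh id on the changed entry and on every ancestor up to '/'.

```python
import itertools

# Hypothetical global transaction counter (an assumption, not kernel code).
_next_id = itertools.count(1)


class Node:
    """One entry in a toy directory tree, carrying an FSMID-like field."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.fsmid = next(_next_id)


def mark_modified(node):
    """Stamp the node and every ancestor, up to the root, with one fresh id."""
    new_id = next(_next_id)
    while node is not None:
        node.fsmid = new_id
        node = node.parent
    return new_id


root = Node("/")
usr = Node("usr", root)
src = Node("src", usr)
home = Node("home", root)

before_root, before_home = root.fsmid, home.fsmid
mark_modified(src)  # a change deep in /usr/src becomes visible at '/'
```

    In this model the id on '/' changes after the update, while the
    unrelated /home branch keeps its old id.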

    And as far as searching directories goes... the whole point is to
    reduce the number of directories that have to be searched to JUST the
    portions of the hierarchy containing the modifications.  If one is
    trying to synchronize a huge filesystem, such as many people now have,
    it is extremely important to be able to restrict such synchronization
    to just the elements that have changed, and to do so without having to
    constantly monitor the entire filesystem.

    It's a very good fit, taking the middle ground between a backup 
    method like tar/dump which must scan the entire filesystem in batch,
    and a live journal which requires real time monitoring of all filesystem
    operations.  FSMID gives you an ability to do tar/dump-like mirror
    synchronizations in batch (distinct invocations, without real time
    monitoring), but without having to scan the entire directory structure
    of a large terabyte filesystem.  kqueue can't do that, and I really
    doubt that imon could do that either.
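
    The batch synchronization described above can be sketched as a tree
    walk that skips any subtree whose id matches the one recorded by the
    previous run.  This is an illustrative model only: Node, changed_paths()
    and the last_seen dictionary are hypothetical, and a real tool would
    read st_fsmid from stat() instead of an in-memory tree.

```python
class Node:
    """Toy directory entry with an FSMID-like field and child entries."""

    def __init__(self, name, fsmid, children=()):
        self.name = name
        self.fsmid = fsmid
        self.children = list(children)


def changed_paths(node, last_seen, prefix=""):
    """Return paths whose id differs from last_seen, pruning unchanged subtrees."""
    path = prefix + node.name
    if last_seen.get(path) == node.fsmid:
        return []  # id unchanged, so nothing below this point changed either
    result = [path]
    for child in node.children:
        result += changed_paths(child, last_seen, path.rstrip("/") + "/")
    return result


# Current state: something under /usr/src was modified (id 9 reached '/').
tree = Node("/", 9, [
    Node("usr", 9, [Node("src", 9)]),
    Node("home", 4, [Node("a", 3)]),
])

# Ids recorded by the previous, separate invocation.
last_seen = {"/": 5, "/usr": 2, "/usr/src": 3, "/home": 4, "/home/a": 3}

to_sync = changed_paths(tree, last_seen)
```

    Here the walk visits '/', /usr and /usr/src and never descends into
    /home, because the id on /home is unchanged since the last run.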

    The methodology behind the transaction id assignments can make this
    a 100% reliable operation on a *RUNNING*, *LIVE* system.  Detecting
    in-flight changes is utterly trivial.
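
    One way in-flight detection could work, assuming every modification
    changes the id at the base of the hierarchy: sample the base id before
    a pass, run the pass, and sample it again.  The sketch below is a
    guess at the approach, not the committed design; sync_until_stable(),
    read_root_fsmid and scan are hypothetical names, and the filesystem is
    simulated.

```python
def sync_until_stable(read_root_fsmid, scan, max_passes=10):
    """Repeat scan() until the root id is the same before and after a pass."""
    for passes in range(1, max_passes + 1):
        start = read_root_fsmid()
        scan()
        if read_root_fsmid() == start:
            return passes  # nothing changed while this pass was running
    raise RuntimeError("hierarchy kept changing during every pass")


# Simulated filesystem: one write lands while the first scan is running.
state = {"root_fsmid": 10, "writes_in_flight": [11]}


def fake_scan():
    """Stand-in for a sync pass; applies any pending simulated write."""
    if state["writes_in_flight"]:
        state["root_fsmid"] = state["writes_in_flight"].pop()


passes_needed = sync_until_stable(lambda: state["root_fsmid"], fake_scan)
```

    In the simulation the first pass sees the id move from 10 to 11 and
    is repeated; the second pass sees no movement and the loop stops.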

    Nesting overhead is an issue, but not a big one.  It's a very solvable
    problem and certainly should not hold up an implementation.  The only
    real issue occurs when someone does a write() vs someone else stat()ing
    a directory along the parent path.  Again, very solvable and certainly
    not a show stopper in any way.

:It should also be kept in mind that persistent storage is almost fully a
:dream, since no current filesystem allows it nor is it really possible
:to correctly implement the behaviour without adding a lot of nasty hacks
:e.g. restores as well.
:
:Joerg

    Not sure what you mean by no filesystem allowing it.  It's an almost
    trivial matter to add it to UFS.  It certainly isn't difficult.  It
    is certainly entirely possible to correctly implement the desired
    behavior.

    When UFS was originally developed there was some discussion about 
    propagating e.g. ctime back to the root of the mount point.  It wasn't
    done due to concerns about overhead, but I think also because the rest
    of the system simply wasn't designed to be able to accommodate the caching
    infrastructure required to support that sort of thing.  Well, DragonFly's
    new namecache infrastructure is *FULLY* capable of supporting that sort
    of thing, and it would be a lot easier to implement such a beast in
    DragonFly than, say, in FreeBSD.

					-Matt
					Matthew Dillon 
					<dillon at xxxxxxxxxxxxx>




