HAMMER version 4 work in progress (NOT for production systems)

Matthew Dillon dillon at apollo.backplane.com
Sun Nov 1 17:20:50 PST 2009


    I will be committing bits of work over the next week or so related to
    HAMMER version 4 into master.  This version is marked as WIP (work in
    progress).  It is *NOT* ready for prime time.  Do *NOT* do a forced
    upgrade to version 4 at this time except on scratch filesystems that
    you don't mind losing, for testing purposes only.

    Nearly all of the work is conditionalized on version 4 so running
    the master branch on existing version 1-3 filesystems should still
    work fine.  I don't guarantee it, but I did my best to conditionalize
    it for version 4.  Version 3 filesystems will format the UNDO space a
    bit differently but it should be backwards compatible.

    The first bit of work committed today involves changing the layout
    of the media structures in the UNDO FIFO such that the crash recovery
    code can locate the end of the undo range (and detect any missing
    sectors) without having to rely on the index range stored in the volume
    header.  This in turn allows the flush code to remove one of the two
    disk synchronization commands which exist in the flush path.
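
    To illustrate the idea only, here is a minimal sketch of how a
    recovery scan can find the end of a circular FIFO from the records
    themselves.  The slot structure, the signature value and the function
    names are hypothetical and much simpler than HAMMER's real media
    structures; the sketch assumes each slot carries a sequence number
    that increases by one per slot written.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified slot header; NOT HAMMER's real media layout. */
struct undo_slot {
    uint32_t signature;   /* SLOT_SIGNATURE if the slot was fully written */
    uint32_t seqno;       /* increases by one for every slot written */
};

#define SLOT_SIGNATURE 0x554E444FU   /* arbitrary marker for this sketch */

/*
 * Return the index of the most recently written valid slot (the end of
 * the undo range), or -1 if no slot is valid.  The signed difference
 * keeps the comparison correct if the 32-bit sequence number wraps.
 */
static int
undo_find_end(const struct undo_slot *fifo, size_t nslots)
{
    int end = -1;

    for (size_t i = 0; i < nslots; i++) {
        if (fifo[i].signature != SLOT_SIGNATURE)
            continue;
        if (end < 0 || (int32_t)(fifo[i].seqno - fifo[end].seqno) > 0)
            end = (int)i;
    }
    return end;
}

/*
 * Walk backwards from the end slot and count how many slots form an
 * unbroken run of sequence numbers.  The run stops at the oldest slot
 * or at a missing or torn slot.  'end' must be a valid index returned
 * by undo_find_end().
 */
static size_t
undo_contiguous_len(const struct undo_slot *fifo, size_t nslots, int end)
{
    size_t len = 1;
    size_t cur = (size_t)end;

    while (len < nslots) {
        size_t prev = (cur + nslots - 1) % nslots;

        if (fifo[prev].signature != SLOT_SIGNATURE ||
            fifo[prev].seqno + 1 != fifo[cur].seqno)
            break;
        cur = prev;
        len++;
    }
    return len;
}
```

    Because the end is derived from the slots alone, a scan of this kind
    does not need an up-to-date index range in a separate header.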

    The next bit of work will involve adding forward-looking REDO records
    to improve lseek+write+fsync sequences.  File creation will still have
    to go through the normal full flush but once the inode is on-disk my
    hope is I can work out a way to lay down a REDO record for a write()
    such that an fsync() only needs to flush UNDO blocks, and does not have to
    flush the volume header or meta-data.  The crash recovery sequence
    would first undo meta-data changes, then re-run higher level REDO
    operations on related inode(s).  Theoretically we can implement REDO
    for anything, including file creation, but realistically its most
    important use is for small lseek+write+fsync sequences to support
    database operations.
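
    For reference, the application-side pattern in question looks
    roughly like the sketch below.  It uses only standard POSIX calls;
    the helper name update_record() is made up for illustration and is
    not taken from any particular database.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Overwrite len bytes at offset off in an already-open file and do not
 * return until fsync() has completed.  Returns 0 on success, -1 on error.
 */
static int
update_record(int fd, off_t off, const void *buf, size_t len)
{
    const char *p = buf;

    if (lseek(fd, off, SEEK_SET) == (off_t)-1)
        return -1;
    while (len > 0) {
        ssize_t n = write(fd, p, len);

        if (n <= 0)             /* treat short-circuit cases as errors */
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return fsync(fd);           /* the expensive step on the filesystem side */
}
```

    A database typically issues many such small updates against a file
    that already exists, so the cost of each fsync() dominates.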

    Needless to say, if I can get REDO working then fully-crash-recoverable
    fsync operations will become very, very fast.

    Version 4 is not yet ready for any serious testing.  If you do wish to
    test it the paths being modified here are the crash recovery paths and
    the only real way to test it is by pulling the SATA cable out of the
    drive during heavy disk I/O (do NOT pull the power on a drive), or
    panicking the system on purpose to simulate a crash, and then observing
    whether the crash recovery works on mount after reboot.  Testing inside
    a vkernel might also be useful but you need a fairly large HAMMER virtual
    disk to test realistically.

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>




