HAMMER version 4 work in progress (NOT for production systems)
Matthew Dillon
dillon at apollo.backplane.com
Sun Nov 1 17:20:50 PST 2009
I will be committing bits of work over the next week or so related to
HAMMER version 4 into master. This version is marked as WIP (work in
progress). It is *NOT* ready for prime time. Do *NOT* do a forced
upgrade to version 4 at this time except on scratch filesystems that
you don't mind losing, for testing purposes only.
Nearly all of the work is conditionalized on version 4 so running
the master branch on existing version 1-3 filesystems should still
work fine. I don't guarantee it, but I did my best to conditionalize
it for version 4. Version 3 filesystems will format the UNDO space a
bit differently but it should be backwards compatible.
The first bit of work committed today involves changing the layout
of the media structures in the UNDO FIFO such that the crash recovery
code can locate the end of the undo range (and detect any missing
sectors) without having to rely on the index range stored in the volume
header. This in turn allows the flush code to remove one of the two
disk synchronization commands which exist in the flush path.
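
As a rough illustration of the idea only (this is not HAMMER's actual
media format), a recovery scan can find the end of the undo range on its
own if every FIFO block carries a signature and a sequence number. The
names struct undo_blk, UNDO_SIG and undo_scan_end() are hypothetical, and
the sketch assumes the first block of the range is already known.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-block header, NOT the real HAMMER media structure. */
struct undo_blk {
    uint32_t signature;  /* marks a block that was fully written */
    uint32_t seqno;      /* increases by one for each block laid down */
};

#define UNDO_SIG 0xC84EU /* made-up signature value */

/*
 * Walk a circular FIFO of nblks blocks starting at index 'first' and
 * return how many consecutive valid blocks follow. A bad signature or
 * an unexpected sequence number ends the range, whether that is the
 * true end of the FIFO or a sector that never made it to the media.
 */
size_t
undo_scan_end(const struct undo_blk *fifo, size_t nblks, size_t first)
{
    uint32_t expect;
    size_t count = 0;

    assert(nblks > 0 && first < nblks);
    expect = fifo[first].seqno;

    while (count < nblks) {
        const struct undo_blk *b = &fifo[(first + count) % nblks];

        if (b->signature != UNDO_SIG || b->seqno != expect)
            break;
        ++expect;   /* unsigned wrap-around is well defined */
        ++count;
    }
    return count;
}
```

Because the scan needs nothing beyond the blocks themselves, a flush in
a scheme like this does not have to push an updated end index to the
volume header before the undo range can be trusted.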
The next bit of work will involve adding forward-looking REDO records
to improve lseek+write+fsync sequences. File creation will still have
to go through the normal full flush but once the inode is on-disk my
hope is I can work out a way to lay down a REDO record for a write()
such that a fsync() only needs to flush UNDO blocks, and does not have to
flush the volume header or meta-data. The crash recovery sequence
would first undo meta-data changes, then re-run higher level REDO
operations on related inode(s). Theoretically we can implement REDO
for anything, including file creation, but realistically its most
important use is for small lseek+write+fsync sequences to support
database operations.
Needless to say, if I can get REDO working then fully-crash-recoverable
fsync operations will become very, very fast.
Version 4 is not yet ready for any serious testing. If you do wish to
test it, the paths being modified here are the crash recovery paths and
the only real way to test it is by pulling the SATA cable out of the
drive during heavy disk I/O (do NOT pull the power on a drive), or
panicking the system on purpose to simulate a crash, and then observing
whether the crash recovery works on mount after reboot. Testing inside
a vkernel might also be useful but you need a fairly large HAMMER virtual
disk to test realistically.
-Matt
Matthew Dillon
<dillon at backplane.com>