Off-Topic Question

Matthew Dillon dillon at apollo.backplane.com
Mon Mar 7 12:09:55 PST 2005


    Right... there will be no background fsck in DFly.

    Our filesystem work can be broken down into two major areas:

    * High level journaling.

	I'm about 1/3 of the way through.  People can play with the current
	high level journaling code using mountctl(8) and jscan(8).  High
	level journaling is operations-level journaling.  It is NOT suitable
	for fast recovery after a crash; it is designed to do things like,
	oh, real-time off-site backups, undo, redo, infinitely fine-grained
	incremental backups, mirroring, file versioning, security audits,
	system monitoring, acting as a data transport layer for the later
	clustering work, and a billion other useful things.

	When used as a backup medium the typical methodology would be to
	use the journal to keep a running mirror on a different system and
	to use any remaining disk space on the target to hold as much of
	the historical journaling stream as possible to give you the ability
	to restore anything as of any point in history (that you still have
	journaled data for).
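
	As a rough illustration of the "restore anything as of any point in
	history" idea, here is a minimal sketch that rebuilds a mirror by
	replaying an operations journal up to a chosen time.  The record
	layout (timestamp, op, path, data) and the replay() function are
	hypothetical and are not the DragonFly journal stream format.

```python
# Hypothetical record layout: (timestamp, op, path, data).
# This is an illustration only, not the real journal stream format.
def replay(journal, as_of):
    """Rebuild a {path: contents} mirror from records with timestamp <= as_of."""
    files = {}
    for ts, op, path, data in journal:
        if ts > as_of:
            break  # records are assumed to be stored in time order
        if op == "write":
            files[path] = data
        elif op == "remove":
            files.pop(path, None)
        elif op == "rename":
            # For a rename, `data` holds the destination path.
            if path in files:
                files[data] = files.pop(path)
    return files


journal = [
    (100, "write", "/etc/motd", "hello"),
    (200, "write", "/etc/motd", "hello, world"),
    (300, "rename", "/etc/motd", "/etc/motd.old"),
    (400, "remove", "/etc/motd.old", None),
]

# State of the mirror as it was at time 250.
snapshot = replay(journal, 250)
```

	The further back the retained journal goes, the earlier the value
	of as_of that can still be reconstructed.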

	When used as a security/audit medium you would back up the journaling
	stream itself, pretty much forever.

	When used for undo/redo/versioning you would index the journal on
	the fly to quickly be able to access data related to directory
	subhierarchies and then use that to access the necessary data quickly.
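
	One possible shape for such an on-the-fly index is sketched below,
	assuming each journal record can be identified by an offset and
	names a single path.  The JournalIndex class and its methods are
	hypothetical names used only for illustration.

```python
from collections import defaultdict
import posixpath


class JournalIndex:
    """Maps each directory to the offsets of journal records beneath it."""

    def __init__(self):
        self.by_dir = defaultdict(list)  # directory -> list of record offsets

    def add(self, offset, path):
        # Register the record under every ancestor directory of `path`,
        # so a lookup on a subhierarchy needs no scan of the journal.
        d = posixpath.dirname(path)
        while True:
            self.by_dir[d].append(offset)
            parent = posixpath.dirname(d)
            if parent == d:  # reached the root (or an empty relative prefix)
                break
            d = parent

    def lookup(self, directory):
        # Offsets of all records that touched something under `directory`.
        return self.by_dir.get(directory, [])


idx = JournalIndex()
idx.add(0, "/usr/src/sys/a.c")
idx.add(1, "/usr/src/bin/b.c")
idx.add(2, "/etc/motd")
```

	This trades index size for lookup speed: every record is listed
	once per ancestor directory.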

    * Generic Low level journaling (a few months away at least, probably
      longer)

	This is actually easier to accomplish but is only really useful for
	fast crash recovery, so I'm doing it second.  The idea here is to
	have a block level layer to generate a reversible block level journal
	and then have the filesystem provide hints to it as to where the
	good restore-to points are.

	With this methodology the filesystem would be able to operate fully
	asynchronously and the block level journal layer would ensure that
	the undo information is synchronized out to the journal prior to
	the actual block write being performed.  Fast recovery is then a
	matter of re-running the last 30-60 seconds worth of the journal
	to the most recent hinted-at recovery point.

	Only around 5 minutes' worth of the journal would have to be kept.
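
	The undo-before-write ordering and the hinted recovery points can
	be sketched as follows.  This is a toy in-memory model under stated
	assumptions: a dict stands in for the block device, and the
	UndoJournal class and its method names are hypothetical.  Trimming
	old journal records is not modeled.

```python
class UndoJournal:
    """Toy reversible block-level journal with filesystem-supplied hints."""

    def __init__(self, disk):
        self.disk = disk  # block number -> bytes; stands in for the device
        self.log = []     # undo records (blkno, old contents), oldest first
        self.hint = 0     # log length at the most recent hinted recovery point

    def write(self, blkno, data):
        # Record the old contents first, then perform the actual block write.
        self.log.append((blkno, self.disk.get(blkno)))
        self.disk[blkno] = data

    def recovery_point(self):
        # Hint from the filesystem: the state right now is a good restore-to point.
        self.hint = len(self.log)

    def recover(self):
        # Undo, newest first, every block write made after the last hint.
        while len(self.log) > self.hint:
            blkno, old = self.log.pop()
            if old is None:
                self.disk.pop(blkno, None)  # block did not exist before
            else:
                self.disk[blkno] = old


disk = {1: b"A", 2: b"B"}
j = UndoJournal(disk)
j.write(1, b"A1")
j.recovery_point()   # consistent state: {1: A1, 2: B}
j.write(1, b"A2")
j.write(3, b"C")
j.write(2, b"B2")
j.recover()          # simulate crash recovery back to the hinted point
```

	Writes made before the hint survive recovery; everything after it
	is rolled back.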

	If someone else wants to work on this, it's completely independent
	of the HL journaling work so it wouldn't interfere with what I am
	currently working on.  There are a bunch of side issues, especially
	related to how the hinting should work to deal with certain
	inter-dependent data situations (that occur most often when one is
	renaming and rm -rf'ing lots of files simultaneously), but those
	don't have to be dealt with immediately.

					-Matt
					Matthew Dillon 
					<dillon at xxxxxxxxxxxxx>




