Description of the Journaling topology

Thu Dec 30 10:15:58 PST 2004

Matthew Dillon <dillon at xxxxxxxxxxxxxxxxxxxx> writes:

> :Barely understanding the implications of this concept, it strikes me as
> :mostly logical, clean and relatively simple.
> :Which makes me curious why other projects haven't done this already?
> :What is the major reason that other projects follow a different path than
> :this one?
> :
> :-- 
> :mph
>
>     The concepts aren't new but my recollection is that most journaling
>     implementations are directly integrated into the filesystem and this
>     tends to limit their flexibility.  Making the journaling a kernel 
>     layer and taking into account forward-looking goals really opens up
>     the possibilities.  Forward-looking is not something that people are
>     generally good at in either the open-source or the commercial world.
>     (proof of concept: why ext3 is such a mess, why existing journaling
>     implementations are so limited in scope).

Mac OS X also has VFS journaling... at least in part.  But I think it's more
of a front-end/back-end system without the FD based "streaming" stuff 
you've mentioned.  I think HFS+ is the only filesystem that's currently
implementing the back-end of it all.

This is a very powerful concept you've got here.

Who knows what will be available in Tiger... no one has access to the kernel
sources yet.

>     Our journaling layer is designed to address these issues.  Providing a
>     high level filesystem operations change stream off-site is far more
>     robust than providing a block device level change stream.  Being able
>     to go off-site in real-time to a secure (or more secure) machine can't
>     be beat.  Being able to rewind the journal to any point in time, 
>     infinitely fine-grained, gives security managers and sysops and even
>     users an incredibly powerful tool for deconstructing security events
>     (e.g. log file erasures), recovering lost data, and so on and so forth. 
>     These are very desirable traits, yah?
>
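
[A rough illustration of the "rewind the journal to any point in time" idea
quoted above. This is only a sketch under assumptions: the record layout, the
"write"/"remove" operation names and the replay() helper are hypothetical and
are not taken from the DragonFly journaling code. It replays a stream of
high-level filesystem operations up to a chosen timestamp, which is how an
erased log file could be recovered from an off-site journal.]

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    ts: float          # time the operation was journaled
    op: str            # hypothetical op names: "write" or "remove"
    path: str
    data: bytes = b""  # full file contents for "write" (a simplification)


def replay(journal, until):
    """Rebuild {path: contents} by applying every record with ts <= until."""
    files = {}
    for rec in sorted(journal, key=lambda r: r.ts):
        if rec.ts > until:
            break
        if rec.op == "write":
            files[rec.path] = rec.data
        elif rec.op == "remove":
            files.pop(rec.path, None)
    return files


journal = [
    Record(1.0, "write", "/var/log/auth.log", b"login root from 10.0.0.5\n"),
    Record(2.0, "write", "/etc/motd", b"hello\n"),
    Record(3.0, "remove", "/var/log/auth.log"),  # the log file erasure
]

before_erasure = replay(journal, until=2.5)  # rewind to just before the erasure
after_erasure = replay(journal, until=3.0)   # state once the erasure is applied
```

[Because each record describes a filesystem operation rather than a block
write, the replay works on any target that understands files and paths.]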

Yah :).  Speaking of VFS coolness, do you think there is room to do a
private, per-process namespace implementation similar to that of Plan 9/Inferno?

This has greatly helped Plan 9 on grid installations make sense of the vast
array of filesystem servers it can connect to in a large deployment situation.

I'm planning a post regarding this and its capabilities to kernel once I get
my thoughts organized.  I think DragonFly is the perfect environment for such
an implementation given Matt's dedication to fixing and improving the VFS layer.
It would make chroots/jails very cheap and incredibly common :).  I also think
the functionality would make a lot of sense on SSI clusters.
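
To make the idea a bit more concrete, here is a minimal sketch of what a
private, per-process namespace amounts to. It is only an illustration under
assumptions: the Namespace class, its fork/bind/resolve methods and the
"local:" and "fileserver:" target names are all hypothetical, and nothing here
reflects Plan 9's or DragonFly's actual code. Each process carries its own
mount table, a child starts from a copy of its parent's table, and path lookup
picks the longest matching prefix.

```python
class Namespace:
    """Per-process mount table mapping path prefixes to backing targets."""

    def __init__(self, mounts=None):
        # Copy the table so every namespace owns a private dict.
        self.mounts = dict(mounts or {"/": "local:/"})

    def fork(self):
        # The child gets a private copy; its later binds do not leak back.
        return Namespace(self.mounts)

    def bind(self, prefix, target):
        self.mounts[prefix.rstrip("/") or "/"] = target

    def resolve(self, path):
        # Longest matching prefix wins, as in an ordinary mount table.
        best = max(
            (p for p in self.mounts
             if path == p or path.startswith(p.rstrip("/") + "/")),
            key=len,
        )
        rest = path[len(best):].lstrip("/")
        return self.mounts[best].rstrip("/") + "/" + rest


parent = Namespace()
child = parent.fork()
child.bind("/home", "fileserver:/export/home")  # visible to the child only

child_view = child.resolve("/home/dave/notes")
parent_view = parent.resolve("/home/dave/notes")
```

Here child_view points at the file server while parent_view still points at
the local disk. A chroot or jail then becomes little more than a process whose
table has been trimmed or rebound.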

More later... [I'm even willing to do some of the work on this one... or all of
it if I can grok the VFS.]

Dave
>     --
>
>     So why hasn't it been done or, at least, why isn't it universal after all
>     these years?
>
>     It's a good question.  I think it comes down to how most programmers
>     have been educated over the years.  It's funny, but whenever I build
>     something new the first question I usually get is "what paper is your
>     work based on?".  I get it every time, without fail.  And every time,
>     without fail, I find myself trying to explain to the questioner that
>     I generally do not bother to *READ* research papers...  that I build 
>     systems from scratch based on one or two sentences' worth of concept.
>
>     If I really want to throw someone for a loop I ask him whether he'd
>     rather be the guy inventing the algorithm and writing the paper, or
>     the guy implementing it from the paper.  It's a question that forces
>     the questioner to actually think with his noggin.
>
>     I think that is really the crux of the problem... programmers have been
>     taught to build things from templates rather than build things from
>     concepts... and THAT is primarily why software is still stuck in the 
>     dark ages insofar as I am concerned.  True innovation requires having
>     lightbulbs go off above your head all the time, and you don't get that
>     from reading papers.  Another amusing anecdote... every time I complained
>     about something in FreeBSD-5 or 6 the universal answer I got was that
>     'oh, well, Solaris did it this way' or 'there was a paper about this'
>     or a myriad of other 'someone else wrote it down so it must be good'
>     excuses.  Not once did I ever get any other answer.  Pretty sad, I think,
>     and also sadly not unique to FreeBSD.  It's a problem with mindset, and
>     mindset is a problem with our educational system (the entire world's).
>
>     I'm really happy that DragonFly has finally progressed to the point where
>     we can begin to implement our loftier goals.  Up until now the work has
>     been primarily ripping out and reimplementing the guts of the system with
>     very little visibility poking through to the end-user.  Now we
>     are starting to push into things that have direct consequences to the
>     end-user.  The journaling is one of the three major legs that will
>     support the ultimate goal of single-system-image clustering.  The second
>     leg is a cache coherency scheme, and the third will be resource sharing
>     and migration.  All three will have to be very carefully and deliberately
>     integrated together into a single whole to achieve the ultimate goal.
>
>     This makes journaling a major turning point for the project... one,
>     I hope, that attracts more people to DragonFly.
>
> 						-Matt