HAMMER recovery and other questions

Matthew Dillon dillon at apollo.backplane.com
Mon Jun 23 15:31:39 PDT 2008


:1) When I have a nohistory mount and have a, say, power breakdown while
:writing data to it, will the transactions be still re-run after
:rebooting?

    The filesystem will be in a consistent state upon re-mounting after
    the crash, but if you didn't explicitly sync your operations to
    disk, HAMMER could unwind up to 30 seconds or so worth of operations
    in order to get the fs back into a consistent state.

    If you fsync() then all data written to the file object in question
    prior to the fsync, plus any related directory infrastructure, is
    guaranteed to be recovered after a crash if the fsync() returns before
    the crash occurs.
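
    As an illustration of the fsync() rule above, here is a minimal
    sketch in Python (the helper name durable_write is hypothetical).
    It only returns once fsync() has returned, which is the point after
    which the written data is expected to survive a crash.  It cannot
    compensate for a drive that reports I/O complete too early.

```python
import os

def durable_write(path, data):
    """Write bytes to path and return only after fsync() has succeeded."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        view = memoryview(data)
        while view:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view)
            view = view[written:]
        # Ask the kernel to push the file's data to stable storage.
        os.fsync(fd)
    finally:
        os.close(fd)
```

    Anything written without such a call falls under the "up to 30
    seconds or so" window described earlier.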

    NOTE!  There are two side issues, one of which I can fix and one of
    which I cannot.  The first is that we haven't implemented the hardware
    disk flush command through the device driver so if the hard disk lies
    about the I/O being complete (and most do), HAMMER's flush sequence
    may wind up being imperfect.  That is, the drive could write the volume
    header before finishing writing the UNDO blocks and cause the crash
    recovery code to fail.  It's fairly easy to add that feature but I
    wonder if someone else could do it :-).  FreeBSD did add that feature
    and it didn't look too complicated.

    The second thing to note is that if you physically pull the plug on a
    hard drive which is in the middle of writing something, you can lose
    the hard drive... and I'm not talking about just losing one or two
    sectors.  I mean you can lose several tracks, even data you weren't
    writing out but which was simply nearby.  You can easily lose the
    entire drive.  A system crash is one thing, an uncontrolled power-down
    is quite another.  Drive manufacturers aren't willing to spend the 
    $0.10 required to put in a big enough capacitor to put the drive into
    a safe state on power failure.

    Some time last year I was running three raid arrays but didn't have the
    UPS's smart status feature hooked into the computing equipment.  A power
    failure occurred and I lost three drives.  Poof, three dead drives.  Now
    I have the UPS hooked in to the computers with apcupsd so the computers
    shut down before the UPS does.

:2) If I understand well, I do a synctid and then I do the softlink on
:the transaction ID and (soft)prune after e.g. 2 days. Does this
:mean all my history gets deleted except for the last 2 days? Is this
:correct?

    Yes.  HAMMER will delete all history prior to that softlink's
    transaction id but will retain all history after it.  Thus the
    history from that transaction id on to 'now' will remain fine-grained.
    (If it doesn't do that tell me, because that's how it is supposed
    to work).

:3) If I make 3 softlinks, like soft1 (made 4 days ago), soft2
:(made 3 days ago) and soft3 (made 1 day ago), delete soft2 and
:softprune, will _all_ the changes done between soft1 and soft2 also be
:deleted?

    Yes.  Once you remove soft2 and then run the prune command any
    history between soft1 and soft3 will be destroyed, including history
    that was previously retained in order to support the soft2 snapshot.
    Now that soft2 is gone, that history will be destroyed.

    All history prior to soft1 will be destroyed, and any history after
    soft3 (from soft3 to current) will be retained and remain 
    fine-grained.
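
    The retention rule described in answers 2 and 3 can be sketched as
    a toy model in Python.  This is not HAMMER's actual pruning code;
    it assumes each version of an object is a (create_tid, delete_tid)
    pair, with delete_tid None for the live version, and that at least
    one snapshot exists.  The function name and the tids are made up
    for illustration.

```python
def retained_versions(versions, snapshot_tids):
    """Return the versions a prune keeps under this simplified model."""
    snaps = sorted(snapshot_tids)
    if not snaps:
        raise ValueError("this model assumes at least one snapshot")
    kept = []
    for create_tid, delete_tid in versions:
        end = float("inf") if delete_tid is None else delete_tid
        # Needed to reproduce the filesystem as of some snapshot.
        visible_at_snapshot = any(create_tid <= s < end for s in snaps)
        # Still existed after the newest snapshot: fine-grained history.
        after_last = end > snaps[-1]
        if visible_at_snapshot or after_last:
            kept.append((create_tid, delete_tid))
    return kept

# Hypothetical tids: soft1 = 100, soft2 = 200, soft3 = 300.
history = [(50, 90), (90, 150), (150, 180), (180, 250),
           (250, 320), (320, 350), (350, None)]
with_soft2 = retained_versions(history, [100, 200, 300])
without_soft2 = retained_versions(history, [100, 300])
```

    In this model the version (180, 250) is kept only while soft2
    exists, since it is not visible at soft1 or soft3.  Once soft2 is
    removed, the next prune drops it.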

:4) Feature suggestion: I think for a little bit more comfortable
:operation, there should be a command that automatically creates a
:softlink. Like: hammer snap /path/to/softlink which does a synctid and
:creates the softlink in the desired path. That way one would not be
:forced to retrieve the transaction ID and create softlinks manually. Or
:have I missed something and you already have implemented this? :-)

    It's a good idea.  Go ahead and add it to the hammer utility.
    Maybe call it 'hammer snapshot <softlink-directory> [<filesystem>]'
    (where the filesystem need only be specified if the softlink 
    directory is not in the desired filesystem).
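
    A rough sketch of what such a helper might do, written in Python
    rather than as part of the hammer utility.  It assumes that
    'hammer synctid <filesystem>' prints the new transaction id as a
    hex string (for example 0x0000000100000000) and that a snapshot
    softlink points at '<filesystem>@@<tid>'.  The function names and
    the injectable get_tid parameter are hypothetical.

```python
import os
import subprocess

def hammer_synctid(filesystem):
    # Assumption: the command prints only the transaction id.
    result = subprocess.run(["hammer", "synctid", filesystem],
                            check=True, capture_output=True, text=True)
    return result.stdout.strip()

def snapshot(softlink_path, filesystem, get_tid=hammer_synctid):
    """Sync the filesystem and create a softlink to that transaction id."""
    tid = get_tid(filesystem)
    base = filesystem.rstrip("/") or "/"
    target = "%s@@%s" % (base, tid)
    os.symlink(target, softlink_path)
    return target
```

    Passing get_tid explicitly makes it possible to try the softlink
    logic on a machine without HAMMER.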

:5) Bug report: please add the nohistory flag to the chflags man
:page. :-)

    (Sascha can you do that for us?)

:6) While we are at nohistory: is it possible to have a fully
:nohistory'd volume with only specific directories for which the user
:would like to retain the history?

    Sure.  The chflags flag propagates automatically so just start
    out by chflagging -R the entire mess nohistory, then chflagging
    -R the bits you want history to be retained on 'history'.  Once
    you've done that any new objects created under directories with
    nohistory set will also be nohistory, and any new objects 
    created under directories which allow history will also allow
    history.
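
    The inheritance behaviour can be pictured with a small in-memory
    model in Python.  It does not call chflags or touch a real
    filesystem; the Node class and chflags_recursive function are
    hypothetical stand-ins for directories and for 'chflags -R'.

```python
class Node:
    """A directory or file carrying a nohistory flag."""

    def __init__(self, name, nohistory=False):
        self.name = name
        self.nohistory = nohistory
        self.children = {}

    def create(self, name):
        # New objects inherit the flag of the directory they are created in.
        child = Node(name, nohistory=self.nohistory)
        self.children[name] = child
        return child

def chflags_recursive(node, nohistory):
    """Stand-in for 'chflags -R': set the flag on node and everything below."""
    node.nohistory = nohistory
    for child in node.children.values():
        chflags_recursive(child, nohistory)

root = Node("/")
home = root.create("home")
tmp = root.create("tmp")
chflags_recursive(root, True)    # mark the entire tree nohistory
chflags_recursive(home, False)   # re-enable history under /home
scratch = tmp.create("scratch")  # inherits nohistory
notes = home.create("notes")     # inherits history
```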

:TIA for the answers and sorry if some of my questions are related to
:straightforward things, I am writing from a user's POV.
:
:-- 
:Gergo Szakal MD <bastyaelvtars at gmail.com>

    I'm going to add an addendum here with regard to the upcoming mirroring
    support.  I am not confident that I can make mirroring work well for
    files marked 'nohistory'.  I will try, but there's a lot of complication
    involved due to the lack of history (and thus the lack of B-Tree
    elements showing what got deleted).  Ultimately such support will be
    required, but I don't know if I can fit that level of sophistication
    into the 2.0 release.

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>




