DragonFly 2.6.2, 2.7.2 tags pushed - fixes for serious HAMMER issue

Matthew Dillon dillon at apollo.backplane.com
Mon May 17 16:43:11 PDT 2010


:
:>The corruption can only occur if your HAMMER filesystem became full
:>or nearly full sometime in the last 45 days or so with a kernel built
:>sometime in the last 45 days.  To check for the corruption you need
:>an unmounted or completely idle filesystem and then run (using the
:>latest hammer utility):
:..
:>    hammer -f <device> show | egrep '^B' | egrep -v '^BM'

:I was hit, but think my FS has been under 90% full at all times.
:(checked in single user, w/ r/o mount)
:
:Any way to find out which files (history) are affected?

    If the filesystem is not idle you can try sync'ing a few times
    and running the hammer -f <device> show | egrep... stuff several
    times to see if the output changes.
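
    As a rough sketch of that repeat-and-compare idea, the check could be
    wrapped in a small shell function.  The names show_cmd, b_errors and
    check_stable are hypothetical helpers made up for this illustration;
    only the hammer show command and the '^B' / '^BM' filtering come from
    the text above (plain grep is used here in place of egrep, the
    patterns are the same).

```shell
# Hypothetical helpers; only the hammer/grep pipeline itself comes from the post.

# Wrapper so the show command can be swapped out when experimenting.
show_cmd() {
    hammer -f "$1" show
}

# Keep lines starting with B, minus the BM (mirror_tid) lines.
# grep exits non-zero when nothing matches; that is not an error here.
b_errors() {
    grep '^B' | grep -v '^BM' || true
}

# Usage: check_stable <device> [runs]
# Prints "stable" if every run gives the same filtered output,
# otherwise prints "changed" and returns 1.
check_stable() {
    dev=$1
    runs=${2:-3}
    prev=$(show_cmd "$dev" | b_errors)
    i=1
    while [ "$i" -lt "$runs" ]; do
        sync
        cur=$(show_cmd "$dev" | b_errors)
        if [ "$cur" != "$prev" ]; then
            echo "changed"
            return 1
        fi
        i=$((i + 1))
    done
    echo "stable"
}
```

    Something like "check_stable <device> 3" would then run the check
    three times with a sync before each repeat.  If it reports "changed",
    the filesystem was probably not idle enough for the output to be
    trusted.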

    Only manually, by locating the errors in the show output and backtracking
    the object id (inode number) to the directory entry.  To do that
    you would have to dump the entire show output to a file, which could
    end up being gigabytes depending on the size of the filesystem.

    hammer -f <device> show > somefile
    less somefile
    /^B

    (but ^BM has to be ignored since those represent mirror_tid errors
    which are probably all over the place prior to the fix which went
    into 2.6).
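
    Since the dump can be very large, it may be easier to list the
    interesting lines with their line numbers first and then jump to
    them.  A minimal sketch, assuming the dump was saved as described
    above; b_lines is a hypothetical helper name, not part of hammer:

```shell
# Hypothetical helper: list the non-BM error lines of a saved dump,
# each prefixed with its line number in the file.
b_lines() {
    # grep -n prefixes every match with "<line>:", so the BM lines
    # show up as "<line>:BM..." and are dropped by the second grep.
    # "|| true" because finding nothing is not an error here.
    grep -n '^B' "$1" | grep -v '^[0-9]*:BM' || true
}
```

    For example, "b_lines somefile" prints one "<line>:<text>" entry per
    suspect record, and "less +<line> somefile" opens the dump at that
    line so the object id can be read off from the surrounding output.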

:>If using mirror-read to copyoff remember it must be run on every PFS
:>individually, and bulk mode (-B) is recommended, and make sure any
:>backups are viable before smashing the original filesystem.
:
:Why is -B recommended?
:In hammer.8 -B is 'not recommended'; should this just be removed?

    -B works around a bug in the incremental mirroring transaction ids
    stored in the B-Tree which was fixed for the 2.6 release but existed
    prior to that.  The bug is self-correcting in that modifications made
    after the bug was fixed will properly deal with the mirror_tid in
    the B-Tree.

:Any way to restore root PFS (#0) fully?
:Root PFS can not be downgraded to slave, for mirror-write,
:so I see no way to get history restored.

    No.  What we really need to do here is get rid of the notion of
    a root PFS entirely and just make all the PFSs operate the same
    way.

    Someone was talking about making it possible to mount the root HAMMER
    filesystem with a PFS # other than 0, as well.  That should also be
    very easy to do, I think; it could be a mini-project for someone.

    In any case, ultimately for people who hit this corruption problem
    the best solution, unfortunately, may be to copy off the data and
    newfs the thing from scratch.

:Beware of cpdup'ing root PFS; symlinks for already restored PFSs
:will be overwritten.
:
:Also remember to copy PFS config (if you use non default).
:(I had to restore PFSs twice, as I did 'hammer cleanup' too early)
:
: -thomas

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>




