Root mount failed:5

Matthew Dillon dillon at apollo.backplane.com
Fri Oct 12 08:18:43 PDT 2012


:On Thu, Oct 11, 2012 at 8:38 PM, Matthew Dillon
:<dillon at apollo.backplane.com> wrote:
:>
:>     * 'hammer show' output might help me figure out what happened, but
:>       won't help you recover the data.
:>
:
:The txt file is 20GB and when bzipped 1.3 GB.
:|MD5 (bd-show.txt.bz2) = 9727a126b93316dba194ff596e9661ff
:
:I am uploading it to leaf:/home/sgeorge/crash/2012-10-12/bd-show.txt.bz2
:
:Will take 5 hours from now to complete
:
:Thanks
:
:Siju

    Ok, I see it uploading.  I'll take a look at it when it's done.
    Here's what I think I'll likely see:

    * That the live dedup hosed the blockmap and it finally hit the wrong
      sector and blew the mount up.

    * That you will have to run a 'hammer recover' scan on the unmounted
      device to recover the filesystem to another drive.

    * And then reformat the original drive and stop using live dedup.

    Basically I think it is really likely that the live dedup hosed the
    filesystem.  We changed the default to OFF last release but I'm starting
    to think that I just have to remove it entirely.  We would still have
    the background dedup, of course... that seems to work fine.

    The problems with live dedup that we have seen to date have primarily
    been a slow corruption of the blockmap.  Eventually this creates a
    situation where HAMMER frees space that isn't actually free, then
    reuses it, and that blows up the CRCs on portions of the filesystem.
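
    As a rough illustration of that failure sequence, here is a toy model
    in Python.  It is NOT HAMMER's actual blockmap or dedup code; ToyVolume,
    make_record and crc_ok are hypothetical names made up for this sketch,
    and the only assumption is that a shared block gets freed while another
    record still references it.

```python
import zlib


class ToyVolume:
    """Hypothetical stand-in for a block allocator (not HAMMER's blockmap)."""

    def __init__(self, nblocks):
        self.data = [b""] * nblocks
        self.free = set(range(nblocks))

    def alloc(self, payload):
        # Take the lowest-numbered free block and write the payload into it.
        blk = min(self.free)
        self.free.remove(blk)
        self.data[blk] = payload
        return blk

    def release(self, blk):
        # Mark the block free. Nothing checks whether it is still referenced.
        self.free.add(blk)


def make_record(vol, payload):
    # A record remembers where its data lives and the CRC it expects there.
    return {"block": vol.alloc(payload), "crc": zlib.crc32(payload)}


def crc_ok(vol, rec):
    return zlib.crc32(vol.data[rec["block"]]) == rec["crc"]


vol = ToyVolume(4)
a = make_record(vol, b"file A contents")

# Assumed bug: a second record is pointed at A's block, but the allocator
# never learns that the block now has two users.
b = {"block": a["block"], "crc": a["crc"]}

# Deleting B frees a block that A still depends on.
vol.release(b["block"])
still_ok_after_free = crc_ok(vol, a)   # data is intact, damage is latent

# Some later, unrelated write reuses the "free" block.
c = make_record(vol, b"unrelated new data")
reused = c["block"] == a["block"]
a_ok = crc_ok(vol, a)                  # A's CRC no longer matches

print(still_ok_after_free, reused, a_ok)
```

    In this model nothing looks wrong between the bad free and the later
    reuse, so the damage only shows up once an unrelated write lands on
    the block.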

    The key point here is that the corruption was likely present for weeks
    before it actually blew the filesystem up.  There's basically no way
    to fix it short of reformatting the filesystem.  And, in fact, to be
    safe, ANY filesystem that live-dedup was ever used on is at risk even
    if it is still operational.

					-Matt
					Matthew Dillon
					<dillon at backplane.com>


