Root mount failed:5
Matthew Dillon
dillon at apollo.backplane.com
Fri Oct 12 08:18:43 PDT 2012
:On Thu, Oct 11, 2012 at 8:38 PM, Matthew Dillon
:<dillon at apollo.backplane.com> wrote:
:>
:> * 'hammer show' output might help me figure out what happened, but
:> won't help you recover the data.
:>
:
:The txt file is 20GB and when bzipped 1.3 GB.
:|MD5 (bd-show.txt.bz2) = 9727a126b93316dba194ff596e9661ff
:
:I am uploading it to leaf:/home/sgeorge/crash/2012-10-12/bd-show.txt.bz2
:
:Will take 5 hours from now to complete
:
:Thanks
:
:Siju
Ok, I see it uploading. I'll take a look at it when it's done.

Here's what I think I'll likely see:

* That the live dedup hosed the blockmap and it finally hit the wrong
  sector and blew the mount up.

* That you will have to run a 'hammer recover' scan on the unmounted
  device to recover the filesystem to another drive.

* And then reformat the original drive and stop using live dedup.

Basically I think it is really likely that the live dedup hosed the
filesystem. We changed the default to OFF last release but I'm starting
to think that I just have to remove it entirely. We would still have
the background dedup, of course... that seems to work fine.

The problems with live dedup that we have seen to date have primarily
been a slow corruption of the blockmap. Eventually this creates a
situation where HAMMER frees space that isn't actually free, then
reuses it, and that blows up the CRCs on portions of the filesystem.

The key point here is that the corruption was likely present for weeks
before it actually blew the filesystem up. There's basically no way
to fix it short of reformatting the filesystem. And, in fact, to be
safe, ANY filesystem that live-dedup was ever used on is at risk even
if it is still operational.
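
To make the failure mode above concrete, here is a rough toy model in
Python. It is only a sketch and not HAMMER code: the ToyVolume class,
the per-block reference count standing in for the blockmap, and the
"buggy" flag are all assumptions made up for illustration, and the
actual live dedup bug is not described in this message. The point it
tries to show is the ordering: the accounting goes wrong at dedup time,
reads keep working after the bad free, and the CRC mismatch only shows
up later when the space is reused.

```python
import zlib


class ToyVolume:
    """Toy block store. 'refs' is a hypothetical stand-in for blockmap
    accounting; it is not how HAMMER actually lays out its blockmap."""

    def __init__(self, nblocks):
        self.data = [b""] * nblocks
        self.refs = [0] * nblocks

    def alloc(self, payload):
        # Reuse the first block the accounting believes is free.
        blk = self.refs.index(0)
        self.data[blk] = payload
        self.refs[blk] = 1
        return blk, zlib.crc32(payload)

    def dedup(self, blk, buggy):
        # A second file shares an existing block. The (assumed) buggy
        # path forgets to account for the extra reference.
        if not buggy:
            self.refs[blk] += 1
        return blk, zlib.crc32(self.data[blk])

    def free(self, blk):
        self.refs[blk] -= 1

    def check(self, ref):
        # Compare the stored CRC against what is on "disk" now.
        blk, crc = ref
        return zlib.crc32(self.data[blk]) == crc


def run(buggy):
    vol = ToyVolume(4)
    a = vol.alloc(b"shared contents")
    b = vol.dedup(a[0], buggy)        # file B now points at A's block
    vol.free(a[0])                    # file A is deleted
    ok_before_reuse = vol.check(b)    # B still reads fine at this point
    vol.alloc(b"unrelated new data")  # a later, unrelated write
    ok_after_reuse = vol.check(b)     # B's CRC is checked again
    return ok_before_reuse, ok_after_reuse


if __name__ == "__main__":
    print("correct accounting:", run(buggy=False))
    print("broken accounting: ", run(buggy=True))
```

In the broken run the first check still passes, because the block
contents are untouched until something else is written into the space
that was wrongly marked free. That gap between the bad free and the
eventual reuse is the toy-model analogue of corruption sitting there
unnoticed for a long time before anything fails.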
-Matt
Matthew Dillon
<dillon at backplane.com>