CRC failures

Matthew Dillon dillon at backplane.com
Wed Nov 23 07:43:32 PST 2016


I split issues into two categories - corrupted media and software bugs.
CRC failures are generally related to corrupted media and ideally we would
not want those to cause panics.  Software bugs we want to cause panics as
quickly as possible so we can track down and fix them.  What HAMMER1 is
missing here is a recovery mechanism to delete the corruption to allow the
filesystem to continue to be used.
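
(A minimal sketch of the two categories, not HAMMER1 code. It assumes
zlib.crc32 as a stand-in for the filesystem's own CRC routine and takes the
4096-byte node size from the console output quoted further down. The names
MediaCorruption and load_node are hypothetical. A CRC mismatch is reported
as a recoverable error, while a malformed request from the caller is treated
as a bug and fails immediately.)

```python
import zlib

NODE_SIZE = 4096  # node size as printed in the quoted console output


class MediaCorruption(Exception):
    """Hypothetical error type: stored CRC does not match the data read back."""


def load_node(data: bytes, stored_crc: int) -> bytes:
    # Software-bug category: a wrong-sized buffer means the caller is broken,
    # so fail loudly and immediately (the analogue of a kernel panic).
    assert len(data) == NODE_SIZE, "caller bug: unexpected node buffer size"

    # Media-corruption category: report the mismatch to the caller instead
    # of bringing everything down, so the rest of the data stays reachable.
    computed = zlib.crc32(data)
    if computed != stored_crc:
        raise MediaCorruption(
            f"CRC mismatch: stored {stored_crc:08x}, computed {computed:08x}"
        )
    return data
```

A caller could catch MediaCorruption, log the node offset, and skip that
node, which is roughly the "continue to be used" behaviour described above.
What is still missing in this sketch, as in HAMMER1, is the step that
deletes or repairs the corrupted node.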

Media corruption appears to most often be associated with system panics and
unexpected power loss.  System panics are not supposed to cause filesystem
corruption at all and I'm still scratching my head as to why it happens
sometimes (it would be a good area to work on with HAMMER1).  Unexpected
power loss can cause media corruption no matter what the filesystem does.
Owing to how hard drives work these days, sectors completely unrelated to
the sectors being written at the time of the power failure can become
corrupt.  Most consumer SSDs can also have similar failure modes when an
unexpected power loss occurs.

-Matt

On Wed, Nov 23, 2016 at 6:05 AM, Tomohiro Kusumi <kusumi.tomohiro at gmail.com>
wrote:

> A CRC failure in hammer1 is basically a mismatch between the on-disk
> CRC and the CRC computed over the data when that data is read from
> disk, so it's likely that one of the two was corrupted.
>
> Many of the CRC failures end up calling panic(), which leads to a kernel
> panic (but not this one). There was a discussion on irc a few weeks
> ago about how the fs should handle CRC failures, because some users would
> rather leave the fs as it is without panicking, so that the fs is at
> least usable for the moment with broken data or metadata.
>
> (I wouldn't go so far as to change the consequences of a CRC failure.
> That's a design-level decision which should be up to dillon.)
>
>
> 2016-11-23 17:43 GMT+09:00 PeerCorps Trust Fund <ipc at peercorpstrust.org>:
> > Hi all,
> >
> > I don't think I have seen anywhere else in past mailing lists an example of
> > corruption on a HAMMER filesystem. But I loaded an old disk today with some
> > nonessential files and began copying them off to another machine. Towards
> > the end of the copy operation the console printed out the below.
> >
> > Perhaps this may be useful information to someone who someday finds
> > something similar in their console.
> >
> > hammer_load_node: CRC B-TREE NODE @ 800000e390268000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026f000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026e000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390269000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390269000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026d000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026e000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390269000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026d000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390268000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026f000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026e000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026f000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390269000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026d000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390268000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026e000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390269000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026f000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026e000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390268000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026d000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390268000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026d000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026e000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026f000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390269000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026d000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390268000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026e000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026f000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390269000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026f000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026f000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026e000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390269000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390268000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026d000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026f000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390268000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390269000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026d000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026e000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390268000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390269000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390268000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390269000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026d000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026e000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026f000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390269000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026e000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026f000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026f000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026e000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026d000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390268000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026e000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026d000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390268000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390269000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026f000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026e000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026d000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390268000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390269000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026e000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026f000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026d000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390268000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390269000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026a000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026e000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026d000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390268000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026b000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026f000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e39026c000/4096 FAILED
> > hammer_load_node: CRC B-TREE NODE @ 800000e390267000/4096 FAILED
> >
> > --
> > Michael
> >
>
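
The console output quoted above repeats a small set of node offsets many
times; by eye it appears to cycle through the 4096-byte nodes from
800000e390267000 to 800000e39026f000. A minimal sketch for collapsing such
output into distinct offsets with counts follows. The regular expression is
based only on the line format shown in this thread, and the names LINE_RE
and summarize are hypothetical.

```python
import re
from collections import Counter

# Matches lines like:
#   hammer_load_node: CRC B-TREE NODE @ 800000e390268000/4096 FAILED
# Leading quote markers ("> > ") are ignored because we search, not match.
LINE_RE = re.compile(
    r"hammer_load_node: CRC B-TREE NODE @ ([0-9a-f]+)/(\d+) FAILED"
)


def summarize(log_text: str) -> Counter:
    """Count how many times each node offset reported a CRC failure."""
    return Counter(m.group(1) for m in LINE_RE.finditer(log_text))


if __name__ == "__main__":
    import sys

    counts = summarize(sys.stdin.read())
    # Print offsets in ascending numeric order with their failure counts.
    for offset in sorted(counts, key=lambda s: int(s, 16)):
        print(f"{offset}  {counts[offset]} failures")
```

Saved under any file name and fed the console log on standard input, this
prints one line per distinct node. A short list of adjacent offsets would
suggest one damaged region on the media rather than widespread corruption.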