RAID 1 or Hammer

Dmitri Nikulin dnikulin at gmail.com
Mon Jan 12 21:06:52 PST 2009


On Tue, Jan 13, 2009 at 2:15 PM, Matthew Dillon
<dillon at apollo.backplane.com> wrote:
>    I've seen uncaught data corruption on older machines, but not in the
>    last few years.  Ah, the days of IDE cabling problems, remembered
>    fondly (or not).  I've seen bad data get through TCP connections
>    uncaught!  Yes, it actually does happen, even more so now that OS's
>    are depending more and more on CRC checking done by the ethernet device.

I once came across a machine in which the IDE cable was plugged in
wrong. It was plugged a whole row over, leaving the end two pins
unconnected, and offsetting each pin that was connected. Somehow the
machine worked fine and only under high load would it start to give
DMA errors, drop down the DMA level (which any OS I've seen does
automatically), and continue slower but stable. I found no evidence of
data corruption and the machine worked at full speeds when the cable
was moved.

One of my own machines had been supplied with a bad ASUS-labelled IDE
cable, which exhibited similar symptoms to the one above that was
plugged in wrong. From both incidents I learned that IDE is pretty
robust on modern chipsets.

I knew I wasn't just paranoid about ethernet bugs. That's why I run
mission-critical links over SSH or OpenVPN so that a proper checksum
and replay protocol take care of it.

>    ZFS uses its integrity check data for a lot more than simple validation.
>    It passes the information down into the I/O layer and this allows the
>    I/O layer (aka the software-RAID layer) to determine which underlying
>    block is the correct one when presented with multiple choices.  So, for
>    example, if data is mirrored the ZFS I/O layer can determine which of
>    the mirrored blocks is valid... A, B, both, or neither.

Actually that's precisely what I like about ZFS, the self-healing with
redundant copies. It means that as long as any corruption is healed
before the redundant copy is similarly damaged, there will be
basically no loss of data *or* redundancy. This is in stark contrast
to a plain mirror in which one copy will become unusable and the data
in question is no longer redundant, and possibly incorrect.
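
The idea can be sketched roughly as follows. This is not ZFS code, just a
toy model under my own assumptions: each block has an independently stored
checksum (SHA-256 here), the mirror is a list of in-memory copies, and the
names read_and_heal and block_checksum are hypothetical.

```python
import hashlib
from typing import List


def block_checksum(block: bytes) -> bytes:
    """Checksum stored separately from the block itself (assumed SHA-256)."""
    return hashlib.sha256(block).digest()


def read_and_heal(copies: List[bytearray], expected: bytes) -> bytes:
    """Return a copy that matches the checksum and repair the ones that do not."""
    good = None
    for copy in copies:
        if block_checksum(bytes(copy)) == expected:
            good = bytes(copy)
            break
    if good is None:
        # Neither A nor B is valid: nothing to heal from.
        raise IOError("no mirror copy matches the stored checksum")
    for copy in copies:
        if bytes(copy) != good:
            copy[:] = good  # rewrite the damaged copy, restoring redundancy
    return good
```

A plain mirror has no stored checksum to consult, so when the two copies
differ it has no way to tell which one to trust.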

>    People have debunked Sun's tests as pertaining to a properly functioning
>    RAID system.  But ZFS also handles any Black Swan that shows up in the
>    entire I/O path.  A Black Swan is an unexpected condition.  For example,
>    an obscure software bug in the many layers of firmware that the data
>    passes through.  Software is so complex these days there are plenty of
>    ways the data can get lost or corrupted without necessarily causing
>    actual corruption at the physical layer.

I confess that, lacking ZFS, I have a very paranoid strategy on my
Linux machines for doing backups (of code workspaces, etc). I archive
the code onto a tmpfs and checksum that, and from the tmpfs distribute
the archive and checksum to local and remote archives. This avoids the
unthinkably unlikely worst case where an archive can be written to
disk, dropped from cache, corrupted, and read back wrong in time to be
checksummed. The on-disk workspace itself has no such protection, but
at least I can tell that each backup is as good as the workspace was
when archived, which of course has to pass a complete recompile.
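
A minimal sketch of that procedure, in Python rather than the actual
scripts. The function names, the .sha256 sidecar file, and the choice of
SHA-256 and tar.gz are assumptions for illustration; the staging directory
is assumed to sit on a tmpfs mount (for example /dev/shm on many Linux
systems).

```python
import hashlib
import shutil
import tarfile
from pathlib import Path


def sha256_file(path) -> str:
    """Hash a file in chunks so large archives do not need to fit in one read."""
    h = hashlib.sha256()
    with open(str(path), "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def backup(workspace, staging_dir, destinations) -> str:
    """Archive into staging_dir (assumed tmpfs), checksum there, then distribute."""
    workspace = Path(workspace)
    archive = Path(staging_dir) / (workspace.name + ".tar.gz")
    with tarfile.open(str(archive), "w:gz") as tar:
        tar.add(str(workspace), arcname=workspace.name)
    # The checksum is taken from the in-memory copy, before any disk is involved.
    digest = sha256_file(archive)
    for dest in destinations:
        target = Path(dest) / archive.name
        shutil.copyfile(str(archive), str(target))
        Path(str(target) + ".sha256").write_text(digest + "  " + archive.name + "\n")
    archive.unlink()  # free the tmpfs space
    return digest


def verify(archive_path) -> bool:
    """Later check: does the stored archive still match its recorded checksum?"""
    recorded = Path(str(archive_path) + ".sha256").read_text().split()[0]
    return sha256_file(archive_path) == recorded
```

Running verify on any local or remote copy at a later date shows whether
that copy still matches what was archived from memory.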

-- 
Dmitri Nikulin

Centre for Synchrotron Science
Monash University
Victoria 3800, Australia




