RAID 1 or Hammer
Bjørn Vermo
bv at opera.com
Wed Jan 14 00:03:46 PST 2009
On 13 Jan 2009, at 04:15, Matthew Dillon wrote:
I've seen uncaught data corruption on older machines, but not in the
last few years. Ah, the days of IDE cabling problems, remembered
fondly (or not). I've seen bad data get through TCP connections
uncaught! Yes, it actually does happen, even more so now that OS's
are depending more and more on CRC checking done by the ethernet
device.
...
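A minimal sketch of how bad data can get through a TCP connection uncaught, assuming the standard 16-bit ones'-complement Internet checksum (as described in RFC 1071). The payload and the function name internet_checksum are made up for illustration. Because the checksum is a plain sum of 16-bit words, swapping two aligned words leaves it unchanged, while a CRC-32 over the same bytes does notice the difference:

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Hypothetical 16-byte payload, not taken from any real capture.
original = b"RAID1 or HAMMER?"
# Simulate corruption that swaps the first two 16-bit words.
corrupted = original[2:4] + original[0:2] + original[4:]

same_sum = internet_checksum(original) == internet_checksum(corrupted)
same_crc = zlib.crc32(original) == zlib.crc32(corrupted)
print("data differs:", original != corrupted)
print("Internet checksum matches:", same_sum)
print("CRC-32 matches:", same_crc)
```

This only shows one class of error the 16-bit sum cannot see. It says nothing about how often such errors occur on real hardware.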
Modern drives (meaning anything with an ATA or SCSI controller in it)
do so much error checking and recovery that the time between the first
externally noticeable failure and total breakdown will be very short.
I have a number of 7-8 year old hand-me-down IBM Netfinity servers that
I use for testing, and the combination of the processing done
by the ServeRAID controllers and the Datastar Ultra320 drives makes
it next to impossible for an error to slip through to the operating
system. I will probably find out soon enough how the eventual
breakdown happens: a drive on a system I'm stress testing has been
showing a yellow warning light for about half a year now. It does not
help to have hot-swappable drives when you have run out of spares...
I have still had errors noticed by JFS or ReiserFS, but they were not
caused by disk problems. On desktop systems, my first suspects
are power supplies and bad capacitors on the motherboard.
Another suspect is software bugs, and on the servers that is the most
plausible cause.
--
Bjørn Vermo
Core networking
Opera Software ASA