Dragonfly used in production
Matthew Dillon
dillon at apollo.backplane.com
Mon Jun 22 11:27:10 PDT 2009
:> Hammer does not have long fsck periods; that was one of the design
:> goals.
:>
:> There's no software RAID as part of Hammer right now; you could use CCD or
:> maybe vinum? This isn't directly helpful, but I usually favor hardware
:> RAID when possible.
:
: If you are using SATA (or IDE) drives you should be able to use
:nataraid (see nataraid and natacontrol man pages).
:
:--
I'm not really happy with any of our current solutions, though vinum
comes closest to fitting the bill. The fake-raid stuff is too
vendor-dependent. I'd rather just have a small boot partition
on every disk in the system or, if it were really that important, one
could simply boot off of a solid state drive which has the same MTBF
as the computer's motherboard. Booting off of an SSD is far more
reliable than nataraid with 2 or 3 normal hard drives.

From my perspective, then, this means that kernel-supported software
RAID is the way to go, e.g. vinum (if it could be made reliable enough),
or we could work up our own solution. The mechanics of a software
RAID system are not all that complex. In fact,
soft-raid mirroring would work quite nicely with HAMMER because HAMMER
already deals with any inconsistencies which might occur due to a
system crash (e.g. due to the soft-raid not being able to complete
an I/O operation across all the disks making up the operation).
On system recovery HAMMER will run its undo and thus re-write any
inconsistent sectors. A soft-raid system will see the re-writes
and be able to recover the consistency for those particular sectors.
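
As a rough illustration of the recovery argument above, here is a toy
model in Python. It is not HAMMER or any real soft-raid code: the Mirror
class, fs_undo(), and the undo-log layout are all hypothetical names
invented for this sketch. It assumes a two-disk mirror with no crash
tracking of its own, and a filesystem that re-writes every sector named
in its undo log during recovery.

```python
# Toy model only: a two-way software mirror that keeps no crash state.
class Mirror:
    def __init__(self, nsectors):
        # Two disks, every sector starts with the same contents.
        self.disks = [[b"old"] * nsectors, [b"old"] * nsectors]

    def write(self, sector, data, crash_after=None):
        """Write to each disk in turn; crash_after=k stops after k disks."""
        for i, disk in enumerate(self.disks):
            if crash_after is not None and i >= crash_after:
                return False  # simulated crash: remaining copies are stale
            disk[sector] = data
        return True

    def inconsistent(self):
        """Return the sectors whose two copies differ."""
        a, b = self.disks
        return [s for s in range(len(a)) if a[s] != b[s]]


def fs_undo(mirror, undo_log):
    """Stand-in for filesystem crash recovery: re-write each logged sector."""
    for sector, old_data in undo_log:
        mirror.write(sector, old_data)


mirror = Mirror(8)
undo_log = [(3, b"old")]                 # old contents are logged before the write
mirror.write(3, b"new", crash_after=1)   # crash: only disk 0 got the new data
print(mirror.inconsistent())             # [3]
fs_undo(mirror, undo_log)                # recovery re-writes sector 3 on both disks
print(mirror.inconsistent())             # []
```

The mirror layer never had to work out which sectors were torn by the
crash; the filesystem's re-writes passed through the normal write path
and brought both copies back in step.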
This means the soft-raid system would only really have to deal with
inconsistencies which build up due to an actual disk failure. It
would NOT have to deal with transactional inconsistencies due to a
system crash. Anyone who knows how RAID works should know that not
having to deal with transactional inconsistencies reduces the complexity
of the RAID implementation by an order of magnitude.

I think this is a more appropriate way to separate the RAID
functionality from the filesystem functionality. The FS has just
enough logic in it to deal with inconsistencies due to system
crashes and the RAID system only has to deal with inconsistencies
due to actual disk failures. Win. Win.
-Matt
Matthew Dillon
<dillon at backplane.com>