Nathan Hawkins utsl at
Mon Sep 13 20:10:06 PDT 2004

Matthew Dillon wrote:

   If you want *real* RAID, you want to use an external raid with a SCSI
   port.  Second choice would be to use a real raid controller.  Third
   choice would be to use a software solution.

In general, I agree with that. But there are some cases where software 
RAID can be more flexible. A good example is temporarily breaking mirrors 
during OS upgrades, which I've found more difficult to do with hardware 
RAID (at least on the systems I've had access to). I've also frequently 
used software RAID (Veritas) to temporarily mirror a volume so that I 
could move it to a different disk. That wasn't necessarily for disk 
replacement; most often it had to do with keeping heavily accessed 
database volumes on different disks.

Being able to do things like that online, with full control over it 
through the OS, is handy.
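
To make the mirror-then-move trick concrete, here is a rough sketch of the 
sequence as a toy Python model. It is not Veritas or Vinum code; the Plex 
and Volume classes, the move_volume helper and the disk names are all 
hypothetical, and a real volume manager does the sync incrementally while 
I/O continues.

```python
# Toy model of moving a volume by mirroring. All names are hypothetical;
# this is not Veritas or Vinum code.
class Plex:
    """One copy of the volume's data on a particular disk."""
    def __init__(self, disk, nblocks):
        self.disk = disk
        self.blocks = [b""] * nblocks
        self.synced = False


class Volume:
    def __init__(self, disk, nblocks):
        first = Plex(disk, nblocks)
        first.synced = True
        self.plexes = [first]

    def write(self, blkno, data):
        # Writes go to every attached plex, so a new mirror stays current.
        for plex in self.plexes:
            plex.blocks[blkno] = data

    def read(self, blkno):
        # Reads are only served from a plex that is fully synced.
        for plex in self.plexes:
            if plex.synced:
                return plex.blocks[blkno]
        raise IOError("no synced plex")

    def attach(self, disk):
        new = Plex(disk, len(self.plexes[0].blocks))
        self.plexes.append(new)
        return new

    def sync(self, new):
        # Copy every block from an already synced plex to the new one.
        src = next(p for p in self.plexes if p.synced)
        for blkno, data in enumerate(src.blocks):
            new.blocks[blkno] = data
        new.synced = True

    def detach(self, disk):
        remaining = [p for p in self.plexes if p.disk != disk]
        if not any(p.synced for p in remaining):
            raise ValueError("refusing to detach the last synced plex")
        self.plexes = remaining


def move_volume(vol, old_disk, new_disk):
    """Attach a mirror on new_disk, sync it, then drop the old copy."""
    new = vol.attach(new_disk)
    vol.sync(new)
    vol.detach(old_disk)
```

For example, move_volume(vol, "da0", "da1") leaves the volume with a 
single plex on da1, and the volume stays readable the whole time. The 
check in detach is the part that matters: the old copy only goes away once 
another synced copy exists.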

   Doing RAID properly in software requires either having battery backed
   ram cache or having reliable off-machine cache (which could be another
   machine on its own UPS, frankly).  I think the off-machine choice is a
   lot more viable these days with the availability of cheap GigE.

The only real difference is where the software runs: on the server or 
on a controller. Granted, battery backed cache is pretty much necessary 
for write performance now that disks are so large. But whether that's 
even an issue depends on the application.

   Vinum is a rather fragile piece of software, it would not be my first
   choice despite all the fine work done on it.

The user interface seems to be where the problems are. Most people have 
trouble with the underlying concepts anyway, and the user interface 
compounds that, making it more likely that people will shoot 
themselves in the foot.

   As far as Geom goes... well, I really dislike the idea of having to
   implement complex drivers in the kernel.  I have been slowly cleaning up
   our IO infrastructure to allow IO devices to be properly stacked (our
   disk layer, for example, is now properly stacked), but ultimately I think
   the real win here will be to form a streaming protocol that could run
   over a TCP socket to govern the I/O (not necessarily NAS).  Then one
   would be able to build drivers to run in userland and/or on remote
   machines.  The only real latency issue is, as always, with READ ops,
   but a kernel supported data block cache at the block device level would
   mostly solve that issue.

Geom seems overly complex. I do like the fact that it abstracts out the 
partition and disklabel stuff, so that it can detect and deal with Sun 
disklabels, Apple partitions, GPT partitions, etc. Having that detection 
in userspace seems somewhat less than ideal.
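
As an illustration of the kind of probing involved, here is a minimal 
Python sketch that looks for a few well-known label signatures in the 
first two sectors of a disk. It is not how Geom is implemented; the 
detect_label name is made up, a 512-byte sector is assumed, and it only 
checks magic numbers without validating checksums or parsing entries.

```python
import struct

SECTOR = 512  # assumed sector size


def detect_label(buf):
    """Return the label types whose signatures appear in buf.

    buf must hold at least the first two sectors of the disk.
    """
    if len(buf) < 2 * SECTOR:
        raise ValueError("need at least two sectors")
    found = []
    # Sun disklabel: big-endian magic 0xDABE at byte offset 508.
    if struct.unpack_from(">H", buf, 508)[0] == 0xDABE:
        found.append("sun")
    # Apple partition map: entries start at block 1 with the signature "PM".
    if buf[SECTOR:SECTOR + 2] == b"PM":
        found.append("apple")
    # MBR: boot signature bytes 0x55 0xAA at offset 510.
    if buf[510:512] == b"\x55\xaa":
        found.append("mbr")
    # GPT: header at LBA 1 starts with the ASCII string "EFI PART".
    if buf[SECTOR:SECTOR + 8] == b"EFI PART":
        found.append("gpt")
    return found
```

Given the first 1024 bytes of a disk, this returns a list because labels 
can coexist: a GPT disk normally carries a protective MBR as well, so both 
show up.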

Which leads to the main thing I wanted to point out: Vinum is the only 
volume manager I've seen on BSD. There's a bit more to it than just RAID.


