kernel work week of 3-Feb-2010 HEADS UP

Freddie Cash fjwcash at gmail.com
Fri Feb 5 07:29:19 PST 2010


On Thu, Feb 4, 2010 at 7:18 PM, Matthew Dillon <dillon at apollo.backplane.com> wrote:

:Is the concern that people would be more inclined to remove an SSD than a
:regular drive by mistake, or that splitting off the log could lead to an
:"oops, I forgot that the log was separate" situation when changing out
:drives?  Or something else?
:
:It seems like an odd thing to worry about, to be honest.  If you can't
:trust users not to start removing important components from their
:systems...
:
:Magnus

    Well, true enough.  I guess the real issue I have is that one
    is dedicating a piece of equipment to a really tiny piece of the
    filesystem.  Though I can't deny the utility of having a fast fsync().
    If the storage system is big enough then, sure.  If you're talking
    about going from one physical drive to two though it probably isn't
    worth the added complexity just to get a fast fsync().

This would be a setup similar to the ZFS L2ARC (cache device) and SLOG
(separate log device).

The cache device is one or more read-optimised (i.e. MLC) SSDs.  Any data
that would be ejected from the in-memory ARC is then written to the cache
device.  Any future reads of that data are pulled from the cache device
instead of from disk.  These devices should be as big and as fast (for
reads) as possible.  The cache is basically treated as extra "RAM".
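
To make the cache device behaviour described above concrete, here is a
toy sketch in Python.  It is not how ZFS is implemented: the real ARC is
an adaptive cache rather than the plain LRU used here, and every name in
it (TwoTierCache, ram_blocks, ssd, disk) is made up for illustration.

```python
from collections import OrderedDict


class TwoTierCache:
    """Toy model: a small RAM cache that spills evicted blocks to an SSD tier."""

    def __init__(self, ram_blocks, disk):
        self.ram = OrderedDict()    # stands in for the in-memory ARC (plain LRU here)
        self.ram_blocks = ram_blocks
        self.ssd = {}               # stands in for the cache device
        self.disk = disk            # stands in for the pool: block id -> data
        self.hits = {"ram": 0, "ssd": 0, "disk": 0}

    def read(self, block):
        if block in self.ram:
            self.ram.move_to_end(block)      # mark as most recently used
            self.hits["ram"] += 1
            return self.ram[block]
        if block in self.ssd:
            data = self.ssd[block]           # served from the cache device
            self.hits["ssd"] += 1
        else:
            data = self.disk[block]          # slowest path: the pool itself
            self.hits["disk"] += 1
        self._insert(block, data)
        return data

    def _insert(self, block, data):
        self.ram[block] = data
        self.ram.move_to_end(block)
        while len(self.ram) > self.ram_blocks:
            old_block, old_data = self.ram.popitem(last=False)
            self.ssd[old_block] = old_data   # ejected from RAM, kept on the SSD


disk = {n: "data-%d" % n for n in range(4)}
cache = TwoTierCache(ram_blocks=2, disk=disk)
for n in (0, 1, 2, 0):
    cache.read(n)
print(cache.hits)
```

In this run the second read of block 0 is served from the SSD tier
rather than from disk, because block 0 was pushed out of RAM when
block 2 was read.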

The separate log device is a mirrored pair (redundancy is critical for
this part) of write-optimised (i.e. SLC) SSDs.  Any block writes smaller
than 64K go directly into the ZIL and are marked as "written to disk"
while also being queued for writing to the pool.  If the server crashes,
the ZIL is read and any transaction groups that are missing from the
pool are copied over from the ZIL.  If the server never crashes, the
data in the ZIL is never actually used.  In most cases, the ZIL only
needs to be a few GB in size.
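
The write path and crash replay described above can be sketched roughly
as follows.  This is a toy model in Python, not ZFS code: the class and
method names (ToyPool, sync_write, flush, crash, replay) are invented
here, and the 64K size threshold and mirroring are left out.

```python
class ToyPool:
    """Toy model of a pool with a separate intent log (not ZFS's real design)."""

    def __init__(self):
        self.log = []        # stands in for the separate log device (survives a crash)
        self.pending = {}    # writes queued in RAM for the pool (lost on a crash)
        self.pool = {}       # stands in for the main pool disks

    def sync_write(self, key, data):
        self.log.append((key, data))    # fast, durable write to the log device
        self.pending[key] = data        # also queued for the slower pool write
        return True                     # caller is told the write is on stable storage

    def flush(self):
        self.pool.update(self.pending)  # queued writes reach the pool
        self.pending.clear()
        self.log.clear()                # these log entries are no longer needed

    def crash(self):
        self.pending.clear()            # RAM contents are gone; log and pool survive

    def replay(self):
        for key, data in self.log:      # copy over whatever never reached the pool
            self.pool[key] = data
        self.log.clear()


p = ToyPool()
p.sync_write("a", 1)
p.flush()                # normal operation: the log was written but never read
p.sync_write("b", 2)
p.crash()                # "b" was acknowledged but had not reached the pool
p.replay()
print(p.pool)
```

After the replay the pool holds both "a" and "b", even though "b" was
still only in RAM and in the log when the crash happened.  Without the
crash, replay() would never have been called and the log contents would
simply have been discarded at flush time.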

Until very, very recent versions of ZFS, removing log devices from a
pool was impossible, so if the log device died, the pool was unusable
and all data was lost, which is why using mirrored sets was important.
One can now remove log devices, which moves the ZIL back into the pool.

This would be similar to the swap cache on an MLC SSD, and the UNDO
log/FIFO on an SLC SSD.

-- 
Freddie Cash
fjwcash at gmail.com



