Background fsck

Matthew Dillon dillon at apollo.backplane.com
Mon Jan 19 10:36:26 PST 2004


:>     There is no reason to spread a meta-data journal all over the disk,
:>     you'll just make it go slower.  All you need is a fixed area of whatever
:>     size you choose to use as a circular FIFO.  e.g. 10MB of space would
:>     be sufficient.
:> 
:>     The seek argument is paramount.  99.9% of the time a disk spends doing
:>     an I/O is seeking.  The actual write takes no time at all.  It's the
:>     seek that matters.
:
:Yeah, but seek time is not constant. Nor are the actual writes spread
:over the complete disk. Well, hopefully at least. What I meant by
:spreading the journal is simply to add an option to the fs code to use
:a nearer journal part. E.g. split the 16MB (looks better) log into
:4MB pieces located in different areas of the disk. The option to
:split the journal into pieces becomes useful if you have RAID or similar
:means in place, which can affect the overall performance dramatically.

    Well, you could do this but I really doubt it would affect performance
    much at all, for several reasons:  First, log file writes do not have
    to occur all that often, even if you are doing something like a
    'rm -rf'... a few times a second at most.  Second, on a modern hard
    drive there isn't all that much of a difference between seeking a few
    tracks and seeking half the disk because it actually takes as long for
    the disk heads to settle on the target track as it does to actually seek
    the heads.  Third, linear writes on a modern hard drive are two
    orders of magnitude faster than seeks, so it is doubtful that you would
    notice any performance degradation even if the log existed on a single
    unstriped physical disk.  And, finally, insofar as RAID goes... the log
    file writes will basically go into NVRAM first anyway, so the RAID
    system is going to be even less sensitive to seeking than the kernel.
    It isn't going to care much either.

    One seek is worth a thousand words.  In our case, the time it takes to
    seek is roughly equivalent to one track's worth of linear writing
    (usually on the order of a megabyte or so).  This is because modern disks
    do whole-track writes now more often than not.  So RAID isn't going to
    help your meta-data writes all that much.
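
    As a rough illustration of the arithmetic above, here is a
    back-of-envelope sketch.  The 8 ms seek-plus-settle time, the 100 MB/s
    linear write rate, and the 64 KB flush size are assumed round numbers
    chosen for the example, not measurements from any particular drive,
    and the function names are hypothetical.

```python
# Back-of-envelope disk arithmetic.  All figures are assumed, not measured.

SEEK_PLUS_SETTLE_S = 0.008   # assumed: ~8 ms to seek and settle the heads
LINEAR_WRITE_BPS = 100e6     # assumed: ~100 MB/s sequential write rate


def seek_equivalent_bytes(seek_s=SEEK_PLUS_SETTLE_S, bps=LINEAR_WRITE_BPS):
    """Bytes that could be written linearly in the time one seek takes."""
    return seek_s * bps


def journal_busy_fraction(flushes_per_s, bytes_per_flush,
                          seek_s=SEEK_PLUS_SETTLE_S, bps=LINEAR_WRITE_BPS):
    """Fraction of each second the disk spends on journal flushes.

    Each flush is modelled as one seek to the log area followed by a
    linear write of bytes_per_flush bytes.
    """
    per_flush_s = seek_s + bytes_per_flush / bps
    return flushes_per_s * per_flush_s


if __name__ == "__main__":
    mb = seek_equivalent_bytes() / 1e6
    print(f"one seek costs about {mb:.1f} MB of linear writing")
    # A few flushes a second, with a far seek and with a seek half as long.
    far = journal_busy_fraction(3, 64 * 1024)
    near = journal_busy_fraction(3, 64 * 1024, seek_s=SEEK_PLUS_SETTLE_S / 2)
    print(f"journal busy fraction: far={far:.3f} near={near:.3f}")
```

    Under these assumptions the journal keeps the disk busy for only a
    few percent of the time either way, so moving the log closer to the
    data saves very little.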

    Now, I agree that on very large filesystems having multiple logs
    spread out over the disk would be beneficial... but it's only when
    you get into the megabytes per second of meta-data updates that
    it starts to matter.  So for an industrial-strength filesystem I 
    agree that it's important.  But I don't think it's something that needs
    to be done from the get-go.  Get the basic log mechanism working with
    a single log area first, then expand the capability to multiple areas
    later.
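
    For reference, a single fixed-size log area used as a circular FIFO
    could be sketched along these lines.  This is a minimal in-memory
    model, not an actual filesystem implementation: the class name
    CircularLog, the bytearray standing in for the on-disk area, and the
    checkpoint() call are all assumptions made for the example.

```python
class CircularLog:
    """Fixed-size log area used as a circular FIFO (hypothetical sketch)."""

    def __init__(self, size):
        self.buf = bytearray(size)   # stands in for the on-disk log area
        self.size = size
        self.head = 0                # absolute offset of the next byte to write
        self.tail = 0                # absolute offset of the oldest live byte

    def free_space(self):
        """Bytes that can be appended before the log wraps onto live data."""
        return self.size - (self.head - self.tail)

    def append(self, record):
        """Append a record; return its absolute offset, or None if full."""
        if len(record) > self.free_space():
            return None              # caller must checkpoint before retrying
        start = self.head
        for i, b in enumerate(record):
            self.buf[(start + i) % self.size] = b   # wrap around the area
        self.head += len(record)
        return start

    def checkpoint(self, upto):
        """Release log space for records whose meta-data is safely on disk."""
        if not self.tail <= upto <= self.head:
            raise ValueError("checkpoint offset outside the live region")
        self.tail = upto


if __name__ == "__main__":
    log = CircularLog(16)
    log.append(b"12345678")
    log.append(b"abcdefgh")
    print(log.append(b"x"))          # log is full until a checkpoint
    log.checkpoint(8)                # first record is now safe on disk
    print(log.append(b"WXYZ"), bytes(log.buf))
```

    Since writes only ever advance through one contiguous area, the
    heads stay put between flushes.  Splitting the log into several
    areas later would mean running one such FIFO per area.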

:[snip]
:>     In other words, a meta-data journal does not slow things down in the
:>     least.
:
:That's what I'm not sure of. There is at least one example of prior
:art -- ext2/ext3. IIRC many filesystem benchmarks give ext3 a worse
:performance. But only implementing this properly and benchmarking it
:will give us the truth.

    The Linux filesystems aren't the best examples in the world.  I'm
    sure we (or you as the case may be :-)) could do a much better job.

:>     I really dislike a R/W mount on anything dirty.
:> 
:>     A journal removes the need entirely... recovery would not take very
:>     long at all so you wouldn't have to support background fscking.
:
:Technically speaking, the journal just saves the garbage collection.
:The log-replay provides the same level of functionality as the
:background fsck. The consistency of the filesystem itself is already
:there (with softdep).
:
:Joerg
 
    Meta-data journaling would replace softupdates... that is, softupdates
    would no longer be necessary.  Softupdates' dependency tracking does
    horrible things to the buffer cache and to the bitmap code.

    Now, softupdates *does* do one thing that I really like... if you create
    a temporary file, mess with it, then delete it before the buffer cache
    has a chance to flush it out to disk, softupdates can remove the dirty
    buffer cache blocks so no I/O occurs at all, ever.

    But I also believe that it would be possible to do something very similar
    with meta-data journaling.
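
    One way this might look, purely as a sketch: keep meta-data
    operations in an in-memory queue until the next log flush, and drop
    every queued operation on a file that is created and then deleted
    before that flush.  The PendingJournal class, the (op, inode) tuples,
    and the operation names are hypothetical, and a real journal would
    also have to discard the file's dirty data buffers.

```python
class PendingJournal:
    """In-memory queue of meta-data operations not yet written to the log.

    Hypothetical sketch: only meta-data operations are modelled here.
    """

    def __init__(self):
        self.pending = []            # (op, inode) tuples in arrival order
        self.written = []            # stands in for the on-disk log

    def record(self, op, inode):
        """Queue an operation, cancelling short-lived files outright."""
        if op == "delete" and ("create", inode) in self.pending:
            # The file was created and deleted before any flush: drop every
            # queued operation on it so nothing about it reaches the disk.
            self.pending = [(o, i) for (o, i) in self.pending if i != inode]
            return
        self.pending.append((op, inode))

    def flush(self):
        """Write whatever is still pending to the log."""
        self.written.extend(self.pending)
        self.pending = []


if __name__ == "__main__":
    j = PendingJournal()
    j.record("create", 7)            # temporary file
    j.record("write", 7)
    j.record("create", 9)            # longer-lived file
    j.record("delete", 7)            # cancels everything queued for inode 7
    j.flush()
    print(j.written)
```

    A delete whose matching create has already been flushed is still
    logged normally, since the on-disk state then has to be undone.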

					-Matt
					Matthew Dillon 
					<dillon at xxxxxxxxxxxxx>




