Background fsck

Joerg Sonnenberger joerg at britannica.bec.de
Mon Jan 19 10:54:12 PST 2004


On Tue, Jan 20, 2004 at 02:22:01AM +0800, Xin LI wrote:
> While it is possible to keep the metadata consistent, it's *required* to write
> the journal data synchronously, which will decrease performance. Without
> synchronous journal writes we risk losing important metadata updates, and
> may even lose the chance to bring the filesystem back into a consistent
> state.

This is not true. The journal itself can be written asynchronously, just
as softdeps allows its writes to be asynchronous. The important point is
that both need ordered writes.
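
To illustrate the difference between synchronous and ordered writes, here
is a toy model. It is not DragonFly or FreeBSD code, and the names
ToyJournal, log, crash_after and replay are made up for this sketch. Records
are queued without waiting for the disk, but they reach the disk strictly in
order, so a crash can lose the newest transaction yet replay still sees a
consistent prefix.

```python
class ToyJournal:
    """Toy model: records are queued asynchronously but reach disk in order."""

    def __init__(self):
        self.queue = []   # records accepted but not yet on disk
        self.disk = []    # records that actually reached the disk

    def log(self, txn_id, updates):
        # Asynchronous: the caller returns at once, nothing waits for the disk.
        for key, value in updates:
            self.queue.append(("update", txn_id, key, value))
        self.queue.append(("commit", txn_id))

    def crash_after(self, n):
        # Ordered writes: the disk holds exactly the first n queued records.
        self.disk = self.queue[:n]
        self.queue = []

    def replay(self):
        # Apply only transactions whose commit record made it to disk.
        committed = {rec[1] for rec in self.disk if rec[0] == "commit"}
        state = {}
        for rec in self.disk:
            if rec[0] == "update" and rec[1] in committed:
                state[rec[2]] = rec[3]
        return state


journal = ToyJournal()
journal.log(1, [("inode 7 nlink", 1), ("dir entry a", 7)])
journal.log(2, [("inode 7 nlink", 2), ("dir entry b", 7)])
journal.crash_after(5)   # crash in the middle of transaction 2
recovered = journal.replay()
```

In this sketch transaction 2 is lost, which is the price of not writing
synchronously, but the recovered state is exactly what transaction 1 left
behind and never a mixture of the two.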

> By marking individual cylinder groups dirty or clean, we will have to
> initiate more write operations, which will also slow down filesystem
> operation. It's questionable whether this code will be simpler than
> soft updates, and we would have to maintain it alone (without sharing the fixes

That's a valid point and I didn't really give it any additional thought.

> available in FreeBSD). What's more, with a journal we will face
> increased wear on certain parts of the disk (because the journal must be
> stored at a fixed location and its writes are usually rolling), and in my
> opinion that is unnecessary when soft updates are available.

That's very important for things like USB sticks or DVD±RW. Or other
kinds of read-many write-not-so-many storage ;)

> Finally, using journalling requires modification to the metadata format,
> which will lead to problems when users upgrade their systems, so a
> converter might be necessary if DFly finally chooses this approach.

Well, it should be possible to add it as a reversible option. E.g.
allocating storage for the journal, and removing that storage after log
replay, are both quite doable.

> >     I really dislike the concept of a background fsck.  I don't trust it.
> For a background fsck to finish successfully, we must at least guarantee
> that the soft updates rules are enforced, and in FreeBSD-CURRENT,
> FreeBSD-STABLE and DragonFly alike they are violated in several ways. For
> instance, to improve performance, hw.ata.wc is enabled by default, and
> because IDE hard disks will "cheat" the operating system about write
> results, the softdep code may end up with incorrect data on disk (because
> the drive "intelligently" re-orders some writes, which the OS does not want).

hw.ata.wc is bad in multiple ways. It can disrupt anything, even a
journaled filesystem.
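
As a rough illustration of why a reordering write cache hurts a journal
too, here is a toy simulation. It is not a model of any real drive, and
recover and torn_states are made-up helper names. It enumerates every point
at which power can fail and checks whether replay ever sees a commit record
without all of its updates.

```python
import itertools


def recover(disk_records):
    """Replay a toy journal: apply updates of transactions that committed."""
    committed = {r[1] for r in disk_records if r[0] == "commit"}
    return {r[2]: r[3] for r in disk_records
            if r[0] == "update" and r[1] in committed}


# One transaction that must change two pieces of metadata together.
records = [
    ("update", 1, "inode 7 nlink", 1),
    ("update", 1, "dir entry a", 7),
    ("commit", 1),
]
full = recover(records)


def torn_states(records, reorder):
    """Return recovered states that are neither empty nor complete."""
    # With reorder=True the drive may persist the records in any order.
    orders = itertools.permutations(records) if reorder else [tuple(records)]
    torn = []
    for order in orders:
        for n in range(len(order) + 1):   # power fails after n records persist
            state = recover(order[:n])
            if state not in ({}, full) and state not in torn:
                torn.append(state)
    return torn


in_order = torn_states(records, reorder=False)
reordered = torn_states(records, reorder=True)
```

With ordered writes the list of torn states stays empty. Once the drive is
free to reorder, the commit record can land before one of the updates, and
replay then applies half a transaction.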

> A second reason why background fsck is questionable might be that the
> snapshot code is still alpha quality and not mature enough.

I don't want to use snapshots; have a look at my original mail.

> Considering that snapshots are usually ephemeral (because they are
> usually used for backups or for a background fsck), I think it might be
> possible to implement the whole SoftUpdates policy in the VFS layer, as
> David Rhodes pointed out in a recent post. Unfortunately, this apparently
> will drastically increase the complexity of the VFS code, and it is
> questionable whether it is worth a try.

Full ACK.

> 
> The only thing that worries me about background fsck is that, while we can
> mount a dirty filesystem and run fsck in the background, an incorrect
> reference count on an i-node may make it impossible to remove it before
> the bgfsck is finally done... This will sometimes cause applications
> to crash...

The application semantics are not changed in any important way. Normal
applications just use unlink(2) and close all open descriptors. If no
references or open descriptors are left, the file is removed. With
background fsck, the kernel is advised to adjust the reference count,
check whether it is zero, and delete the file if no one has an fd left.
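
The rule can be sketched with a toy model. This is not kernel code, and
ToyInode and fsck_adjust are hypothetical names used only for illustration.
The same reclaim check runs whether the counts were changed by unlink, by
close, or by a correction coming from the background fsck.

```python
class ToyInode:
    """Toy model of the reclaim rule: free the inode when nothing refers to it."""

    def __init__(self, nlink):
        self.nlink = nlink      # on-disk reference (link) count
        self.open_fds = 0       # descriptors currently open on the file
        self.freed = False

    def _maybe_reclaim(self):
        # The same check runs no matter who changed the counts.
        if self.nlink == 0 and self.open_fds == 0:
            self.freed = True

    def open(self):
        self.open_fds += 1

    def close(self):
        self.open_fds -= 1
        self._maybe_reclaim()

    def unlink(self):
        self.nlink -= 1
        self._maybe_reclaim()

    def fsck_adjust(self, correct_nlink):
        # Background fsck found a stale count and asks for it to be corrected.
        self.nlink = correct_nlink
        self._maybe_reclaim()


# After a crash the inode claims two links, but only one name exists.
inode = ToyInode(nlink=2)
inode.open()
inode.unlink()                    # application removes the only name: 2 -> 1
inode.close()
freed_before_fsck = inode.freed   # the stale count still keeps the inode
inode.fsck_adjust(0)              # fsck: no directory entry references it
freed_after_fsck = inode.freed
```

In this model the application's unlink and close behave as usual. The only
visible difference is that the storage is reclaimed when the count is
corrected rather than at the last close.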

Joerg




