Background fsck
Matthew Dillon
dillon at apollo.backplane.com
Mon Jan 19 16:34:36 PST 2004
:Joerg Sonnenberger wrote:
:> 4 MB pieces looked in the different areas of the disk. The option to
:> split the journal into pieces becomes useful if you have RAID or similar
:> means in place, which can affect the overall performance dramatically.
:
:I think it could benefit performance putting a journal on a faster RAID 1
:mirror if you have a huge RAID5 array. Remember that if you have one of
:those expensive RAID controllers with battery-backed write-back cache,
:you should have the FS synced so that you have the speed-up
:of the write-back cache AND data consistency guarantee in a crash. softupdates
:alone guarantee metadata consistency in a crash only. You still lose
:data.
You would still lose data with a meta-data-only journal, just not as
much. But meta-data logs have a lot of flexibility: it would be possible
to make certain guarantees, and it is also possible to log file data
as well as meta-data when the circumstances dictate.

For example, take this sequence of operations:

    create file
    write 8k
    write 8k
    write 8k
    ...
    close

In the journal this would be (the T numbers on the left indicate atomic
transactions):

    T1  inode meta data reflecting a directory size change
    T1  directory data reflecting the new directory entry
    T1  bitmap meta data reflecting the inode allocation
    T1  inode meta data reflecting the create
    T2  block allocation meta data
    T2  inode meta data reflecting a file size change to 8K
    T3  block allocation meta data
    T3  inode meta data reflecting a file size change to 16K
    T4  block allocation meta data
    T4  inode meta data reflecting a file size change to 24K

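
As a rough illustration only (this is not taken from any real filesystem
code), the journal listing above could be modeled like this in Python.
Record, journal_for_create_and_writes and committed_prefix are hypothetical
names made up for the sketch, and the 8K block size simply mirrors the
example writes.

```python
from dataclasses import dataclass

BLOCK_SIZE = 8 * 1024  # assumption: one block per 8k write, as in the example


@dataclass(frozen=True)
class Record:
    txn: int    # transaction number (the T number in the listing)
    what: str   # description of the meta-data change being logged


def journal_for_create_and_writes(num_writes):
    """Build the journal records for: create file, num_writes 8k writes, close."""
    # T1: everything needed to make the create atomic.
    records = [
        Record(1, "inode meta data reflecting a directory size change"),
        Record(1, "directory data reflecting the new directory entry"),
        Record(1, "bitmap meta data reflecting the inode allocation"),
        Record(1, "inode meta data reflecting the create"),
    ]
    # T2, T3, ...: one transaction per write (allocation + size change).
    for i in range(1, num_writes + 1):
        txn = i + 1
        size_k = i * BLOCK_SIZE // 1024
        records.append(Record(txn, "block allocation meta data"))
        records.append(
            Record(txn, f"inode meta data reflecting a file size change to {size_k}K")
        )
    return records


def committed_prefix(records, last_committed_txn):
    """Records that survive a crash: only transactions that fully committed."""
    return [r for r in records if r.txn <= last_committed_txn]
```

With three writes this produces the ten records shown above. If only T1 and
T2 had committed before a crash, committed_prefix(records, 2) would keep the
create and the first 8K size change and drop the rest.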
Directory data can be written at any point after T1 commits. That is,
directory data has to be treated the same as meta-data. File
block data can be written at any time, before or after the log is
committed, as long as the blocks are newly allocated blocks and not
a reuse of old data blocks that were recently deleted. The reuse case
can be solved by not actually marking blocks in the bitmap as 'free'
until after the associated meta-data is committed.
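
A minimal sketch of that deferred-free idea, assuming a toy in-memory
bitmap. BlockBitmap and its methods are hypothetical names for
illustration, not an existing kernel interface.

```python
class BlockBitmap:
    """Toy block bitmap that delays frees until the freeing transaction commits."""

    def __init__(self, nblocks):
        self.free = set(range(nblocks))
        self.pending_free = {}  # txn number -> blocks released by that txn

    def allocate(self):
        # Only blocks whose free has been committed are eligible for reuse.
        # min() raises ValueError if no block is available.
        block = min(self.free)
        self.free.remove(block)
        return block

    def release(self, block, txn):
        # Do NOT mark the block free yet; just remember who released it.
        self.pending_free.setdefault(txn, set()).add(block)

    def commit(self, txn):
        # The meta-data for txn is now in the log, so reuse is safe.
        self.free |= self.pending_free.pop(txn, set())
```

A block released in transaction 5 cannot be handed out again until
commit(5) is called, so new file data can never overwrite a block that a
crash might still hand back to the old file.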
What this does not do is tell us that the data blocks are good. If
we were to crash, the file could be the correct size but contain
'garbage'.
One solution to the file-data-garbage problem would be to log file
data block write completions as meta-data:

    ...
    T80 block write complete
    T81 block write complete
    T82 block write complete

Then the crash recovery program could notice that a file block was
allocated but never written and either write out 0's to that block,
or truncate the file.
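
The recovery pass could look roughly like the following sketch. The record
layout (kind, file_id, block_index) and the function names are assumptions
made for the example; a real journal would carry more than this.

```python
BLOCK_SIZE = 8 * 1024  # assumption: same 8K blocks as the earlier example


def unwritten_blocks(journal):
    """Map each file to the block indexes that were allocated but never
    logged as 'write_complete'."""
    allocated = {}
    written = set()
    for kind, file_id, index in journal:
        if kind == "alloc":
            allocated.setdefault(file_id, []).append(index)
        elif kind == "write_complete":
            written.add((file_id, index))
    return {
        f: sorted(i for i in idxs if (f, i) not in written)
        for f, idxs in allocated.items()
    }


def truncated_size(journal, file_id):
    """Size to truncate file_id to so it contains no unwritten block.

    The alternative policy is to keep the size and zero-fill every block
    listed by unwritten_blocks() instead.
    """
    idxs = [i for kind, f, i in journal if kind == "alloc" and f == file_id]
    if not idxs:
        return 0
    missing = unwritten_blocks(journal).get(file_id, [])
    if missing:
        # Cut the file just before the first block that may hold garbage.
        return missing[0] * BLOCK_SIZE
    return (max(idxs) + 1) * BLOCK_SIZE
```

Truncating throws away any good blocks that follow the first unwritten
one, while zero-filling keeps the file size but leaves a hole of 0's, so
which policy is preferable depends on the guarantees you want to offer.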
-Matt
Matthew Dillon
<dillon at xxxxxxxxxxxxx>