HAMMER changes in HEAD, also needs testing

Matthew Dillon dillon at apollo.backplane.com
Mon Jun 22 10:06:54 PDT 2009


:Dennis Melentyev wrote:
:>>    I also made a minor change to cpdup to not use temporary filenames
:>>    when doing a fresh copy (when the target file does not exist).  To test
:>>    HAMMER with cpdup you need the new cpdup.  The old cpdup always creates
:>>    a temporary file and then rename()'s and that unfortunately breaks the
:>>    optimization that HAMMER makes to order the inode numbers.
:> 
:> I'm just not sure it was the right decision. AFAIU, tmpname->rename was
:> there for atomicity of the file operation, so that the file with the
:> correct name appears fully written.

    It's a good point.  I would argue that simultaneous access is a
    problem specific to ftp/www access, and in that case we can add
    a new option to cpdup (say, -t) to force the use of .tmp files in
    all cases instead of just in the overwrite case.
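
    As a rough illustration of the policy being discussed (not cpdup's
    actual code), the sketch below copies directly when the target does
    not exist and goes through a .tmp file plus rename otherwise.  The
    function name copy_file and the force_tmp flag are made up for this
    example; force_tmp only stands in for the proposed -t option.

```python
import os
import shutil


def copy_file(src, dst, force_tmp=False):
    """Copy src to dst; return True if a temporary file was used.

    force_tmp is a hypothetical stand-in for the proposed -t option.
    """
    use_tmp = force_tmp or os.path.exists(dst)
    if not use_tmp:
        # Fresh copy: create the target directly under its final name.
        shutil.copyfile(src, dst)
        return False
    # Overwrite (or forced) case: write a sibling .tmp file, then rename
    # it over the target so a reader never sees a half-written file.
    tmp = dst + ".tmp"
    shutil.copyfile(src, tmp)
    os.replace(tmp, dst)
    return True
```

    The direct path gives up the "appears fully written" guarantee for
    new files, which is the trade-off raised in the quoted text.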

:I see two different possible optimization ways:
:
:- Determine inode number from the parent directory inode number, not from 
:the filename hash.  This won't give linearity within the directory scan, 
:but will give rough spatial locality vs temporal locality (i.e. adding a 
:file to a directory every day won't make hammer jump all over the place). 
:  This is very important in my eyes.
:
:- Implement "link data extents from other file" ioctl.  Just write data to 
:a tmp file, then create the dst file with the data extents of the temp 
:file.  Like a hardlink, just on the backend fs layer.  That's cute, but 
:not really necessary.
:
:cheers
:   simon

    Now that I have NCQ operational with AHCI I have been able to do more
    specific performance tests with find, tar, ls -lR, etc.

    Right now HAMMER's biggest issue with the likes of find or ls is
    when it calls getdirentries() and when it lstat()'s the first
    entry in the directory.  If the information is not cached in the
    buffer cache, both cases take an excessive amount of time.
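
    For anyone who wants to reproduce this kind of workload, here is a
    minimal sketch of the access pattern: one directory read per
    directory followed by an lstat() of every entry.  The name
    scan_tree is made up for this example, and Python's os.listdir()
    is only assumed to approximate what find or ls do through
    getdirentries().

```python
import os
import stat
import time


def scan_tree(root):
    """Walk root like find or ls -lR; return (dirs_read, lstats, seconds)."""
    dirs_read = 0
    lstats = 0
    start = time.monotonic()
    pending = [root]
    while pending:
        path = pending.pop()
        names = os.listdir(path)       # directory read
        dirs_read += 1
        for name in names:
            full = os.path.join(path, name)
            st = os.lstat(full)        # per-entry inode lookup
            lstats += 1
            # lstat() does not follow symlinks, so links are not descended.
            if stat.S_ISDIR(st.st_mode):
                pending.append(full)
    return dirs_read, lstats, time.monotonic() - start
```

    Comparing the elapsed time of a first run against a repeat run on
    the same tree gives a rough feel for the uncached versus cached
    cost.  Unreadable directories will raise an exception in this sketch.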

    In both cases I think the mistake I made was putting directory
    entries and inodes in different localization domains in the B-Tree.
    Thus when we descend into a directory and want to read its directory
    entries, significant seeks are required to access those entries relative
    to the directory inode.  Similarly when we lstat() the first entry
    in a directory, significant seeks are required to look up the inode(s)
    because they are nowhere near the directory entries.

    I believe I can improve this by moving directory entries into the inode's
    localization domain in the B-Tree, at least for newly created directories.
    Then the B-Tree elements for the directory entries will actually be
    near the B-Tree elements for the related inodes, and the directory entry
    data will be near the inode data.  I'm testing this now.
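
    A toy model may help show why the localization domain matters.  It
    assumes B-Tree keys sort by localization first and object id second,
    and it uses distance in the sorted key list as a crude proxy for
    seek distance.  All constants, the key layout, and the name mean_gap
    are invented for this example and do not reflect the real media
    format.

```python
# Toy model: every constant and the key layout here are assumptions.
LOC_INODE = 1    # hypothetical localization domain holding inode records
LOC_MISC = 2     # hypothetical separate domain for other records
REC_INODE = 1    # hypothetical record types, used only for ordering
REC_DIRENT = 2


def mean_gap(num_dirs, entries_per_dir, dirent_loc):
    """Mean sorted-key distance from a directory inode to its first entry."""
    keys = []
    dir_ids = []
    next_obj = 1
    for _ in range(num_dirs):
        dir_id = next_obj
        next_obj += 1
        dir_ids.append(dir_id)
        keys.append((LOC_INODE, dir_id, REC_INODE, 0))
        for slot in range(entries_per_dir):
            # The entry is keyed under the directory's object id ...
            keys.append((dirent_loc, dir_id, REC_DIRENT, slot))
            # ... and the file it names gets its own inode record.
            keys.append((LOC_INODE, next_obj, REC_INODE, 0))
            next_obj += 1
    keys.sort()
    pos = {key: i for i, key in enumerate(keys)}
    gaps = [pos[(dirent_loc, d, REC_DIRENT, 0)] - pos[(LOC_INODE, d, REC_INODE, 0)]
            for d in dir_ids]
    return sum(gaps) / len(gaps)
```

    In this model mean_gap(n, m, LOC_INODE) keeps each directory's first
    entry right next to its inode, while mean_gap(n, m, LOC_MISC) pushes
    the entries past every inode record, and the gap grows with the
    number of objects.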

    A larger problem is simply the extra seeks required to traverse a
    directory tree, and fixing it would require a major change to HAMMER's
    media structures in order to embed directory entries and inode data
    directly in the B-Tree.  I'll have to save this for some future HAMMER2
    because there is simply no way to maintain media compatibility with
    older versions if I do it.  And it's major surgery on top of that.

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>




