HAMMER changes in HEAD, also needs testing
Matthew Dillon
dillon at apollo.backplane.com
Sat Jun 27 18:34:38 PDT 2009
:I've done some performance tests on a HEAD system from a few days before
:the change, and just after. A ~100GB tar file was extracted to a local
:HAMMER file system, and a cpdup copy of the files was made; the result was
:traversed with find. After updating the system to include the HAMMER & cpdup
:changes of June 20th, the same thing was done again.
:
:Unpacking and cpdup'ing were slower after updating, but find was faster.
:Also, after updating, traversing the cpdup copies made before the update
:was faster.
:
:This wasn't what I expected from the description of the changes, any comments?
Yah, the B-Tree organization is a bit better after the change, so
find is a bit faster, but there are still some major latencies
in getdirentries (if you ktrace the find you can see where the
hangups are). Basically getdirentries and the stat() of the
first file in any given directory generate extra disk seeks
and 18-40ms of latency.
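For example, something along these lines will show where the time goes
(the path is just a placeholder for wherever your test tree lives):

    ktrace find /hammer/pre > /dev/null
    kdump -R | less

kdump -R prints the time since the previous entry, so the big gaps should
show up right on the getdirentries calls and the first stat in each
directory.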
The only way to improve those latencies is to move the B-Tree elements
related to directory entries into the same localization block as the
inode elements. I can do that for newly created directories but they
will not be compatible with older versions of HAMMER, so I have to
test that it actually improves matters before I make it available.
The extraction issue is another data locality of reference problem.
I actually seem to have made it a bit worse, I think, because the data
blocks are getting a bit more spread out with the new B-Tree changes.
However, it isn't as much worse as your tests suggest :-)
Another thing you need to do is run a reblocking operation and even
a rebalance and test the extraction again. Find should get a bit
better and extraction should remain about the same.
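If the filesystem is mounted at, say, /hammer (adjust the mount point to
whatever you are actually using), something like this should do it:

    hammer reblock /hammer
    hammer rebalance /hammer

Both passes can take a while on a filesystem with this many files.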
I won't be able to tackle the extraction problem until I get the
directory entry (find, ls -lR, etc) stuff dealt with. The extra
seeks done by the directory scans tend to blow away the hard drive's
internal cache, so if I can reduce the seeks the extraction should get a
lot faster.
:During tar file extraction after updating system, I noticed, looking at
:'hammer iostats 10', that inode-ops, file-rd & file-wr stalled for long
:periods of time, like 1-2 minutes; dev-read & dev-write was still high;
:after stall inode-ops would rocket up, but only for half a minute,
:then a new stall started. During a stall, it seemed all file system
:operations were also stalled, including processes that I can only imagine
:were using NFS (but this can't be true, I guess).
What is happening is that you are seeing the inode flush operation.
This is why the device ops goes up and the frontend ops goes down.
The meta-data builds up in memory and has to be flushed to disk.
You can monitor this by looking at vfs.hammer.count_records
and vfs.hammer.count_iqueued. When too much has built up the
flush starts running in the background, but the rate at which files
are being extracted is so high they run up against the limit while
the flush is running and stall out. It shouldn't take 1-2 minutes
per flush, but it is pretty nasty if there are lots of tiny files
being created.
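For example, a crude way to watch the buildup while the extraction is
running (the poll interval is arbitrary):

    while :; do
        sysctl vfs.hammer.count_records vfs.hammer.count_iqueued
        sleep 10
    done

When those counters run up against their limits the frontend stalls until
the flusher catches up.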
The meta-data flush is not very efficient. Part of the problem is
the way the inodes are sequenced but it's a tough nut to crack no
matter how I twist it, particularly when extracting large directory
trees. Large directory trees have tons of directory entry dependencies
and the flush operation has to flush the directory entry data in a
particular order to ensure that crash recovery doesn't leave
sub-directories disconnected.
I think, generally speaking, a tar extraction is never going to be
very efficient with HAMMER. I think tar archive creation and read
scans can be made considerably more efficient.
:- BEFORE
:root at bohr# time tar xf /hammer/data/hammer.tar
: 15996.71 real 207.43 user 1768.43 sys
:root at bohr# time cpdup pre pre.cpdup
: 43937.88 real 206.70 user 4339.66 sys
:root at bohr# time find pre | gzip >pre.find.gz
: 847.99 real 19.55 user 185.58 sys
:root at bohr# time find pre -ls | gzip >pre.find.-ls.gz
: 1171.63 real 88.60 user 501.59 sys
:- HEAD system from June 20th, including HAMMER changes
:root at bohr# time tar xf /hammer/data/hammer.tar
: 21740.77 real 203.77 user 1752.30 sys
:root at bohr# time cpdup post post.cpdup
: 72476.51 real 204.83 user 4041.41 sys
:root at bohr# time find post | gzip >post.find.gz
: 488.17 real 13.37 user 126.37 sys
:root at bohr# time find post -ls | gzip >post.find.-ls.gz
: 854.08 real 65.07 user 397.89 sys
I think there might be another issue with those tar xf
and cpdup times. That's too big a difference. Part of it
could be the fact that as the disk fills up you are probably
writing to inner disk cylinders which have considerably less
bandwidth than the outer cylinders, but even that doesn't
account for a 30,000 second difference.
Make sure that other stuff isn't going on, like an automatic
cleanup. You ran your test over 20 hours. I'm guessing
that the hammer cleanup cron job ran during the test.
Also, the daily locate.db cron job probably ran as well. You
have to disable that too because it will probably take forever
to run through a partition full of test files.
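The knob names below are just an illustration of what I mean; check
/etc/defaults/periodic.conf on your system for the exact ones (the locate
rebuild may be weekly rather than daily):

    # /etc/periodic.conf
    daily_clean_hammer_enable="NO"   # skip the automatic hammer cleanup
    weekly_locate_enable="NO"        # skip the locate.db rebuild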
:root at bohr# time find pre.cpdup | gzip >pre.cpdup.post.find..gz
: 362.19 real 14.24 user 124.45 sys
:root at bohr# time find pre.cpdup -ls | gzip >pre.cpdup.post.find.-ls.gz
: 619.08 real 65.47 user 314.53 sys
:root at bohr# time find post.cpdup | gzip >post.cpdup.post.find..gz
: 450.81 real 14.18 user 134.78 sys
:root at bohr# time find post.cpdup -ls | gzip >post.cpdup.post.find.-ls.gz
: 813.99 real 68.53 user 408.46 sys
These are more in line. After a reblock/rebalance the finds
should go a bit faster.
-Matt
Matthew Dillon
<dillon at backplane.com>