Hammer Benchmark Fun
Matthew Dillon
dillon at apollo.backplane.com
Sat Jan 8 11:52:48 PST 2011
Well, the numbers are all over the place and some of them don't make
any sense, which kinda implies cockpit trouble somewhere.
The blogbench tests make some sense, but people will get a false
sense of performance (or lack thereof) because the authors don't
quite understand how blogbench works. Blogbench uses an ever-growing
data set, so if write performance is horrible the concurrent reads
will all fit in the buffer cache, because the data set simply does
not grow very much during the test. The read performance will of
course be very high in that case because it won't be going to disk at
all. If write performance is good then read performance is going to
suffer greatly, simply because the data set winds up much larger (due
to the improved write performance) and thus typically does not fit in
the buffer cache.
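To make that feedback loop concrete, here is a rough sketch (my own
illustration, not the actual blogbench source) of what blogbench
effectively does: a writer keeps growing the data set while readers
pick random existing files. If the writer is slow, nfiles stays
small, the whole set stays in the buffer cache, and the reads never
touch the disk:

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <pthread.h>

    #define FILESZ  (64 * 1024)

    static volatile int nfiles;     /* grows for the whole run */
    static volatile int done;

    static void *writer(void *arg)
    {
        char path[64], buf[FILESZ];

        (void)arg;
        memset(buf, 'x', sizeof(buf));
        while (!done) {
            snprintf(path, sizeof(path), "blog.%d", nfiles);
            int fd = open(path, O_CREAT | O_WRONLY, 0644);
            if (fd >= 0) {
                write(fd, buf, sizeof(buf));
                close(fd);
                ++nfiles;       /* readers now see a larger set */
            }
        }
        return NULL;
    }

    static void *reader(void *arg)
    {
        char path[64], buf[FILESZ];

        (void)arg;
        while (!done) {
            if (nfiles == 0)
                continue;
            snprintf(path, sizeof(path), "blog.%d", rand() % nfiles);
            int fd = open(path, O_RDONLY);
            if (fd >= 0) {
                /* a cache hit if the data set stayed small */
                read(fd, buf, sizeof(buf));
                close(fd);
            }
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t wr, rd;

        pthread_create(&wr, NULL, writer, NULL);
        pthread_create(&rd, NULL, reader, NULL);
        sleep(30);              /* fixed-length run, like blogbench */
        done = 1;
        pthread_join(wr, NULL);
        pthread_join(rd, NULL);
        printf("data set grew to %d files (%lld MB)\n",
            nfiles, (long long)nfiles * FILESZ / (1024 * 1024));
        return 0;
    }

The point is that the read numbers come out as a function of how far
nfiles gets, i.e. as a function of write performance, not of read
performance.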
Another major problem with this test suite (and the other tests too)
is whether e.g. ZFS is doing compression or data de-dup or not. Most
filesystem benchmark programs do NOT write random data into files.
They typically write all zeros, or the same repeating pattern.
Needless to say, when you write files full of zeros or the same
pattern to a filesystem which does compression and/or de-dup, you are
going to see very, VERY high apparent performance. But it isn't real
performance, because real-life activity is not writing all zeros to
files.
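The fix is cheap: fill the write buffers from a PRNG so compression
and de-dup have nothing to collapse. A minimal sketch of the idea
(the xorshift generator is an arbitrary choice on my part, anything
incompressible works and /dev/urandom is just slower):

    #include <stdint.h>
    #include <unistd.h>
    #include <fcntl.h>

    static uint64_t state = 88172645463325252ULL;

    static uint64_t xorshift64(void) /* cheap incompressible stream */
    {
        state ^= state << 13;
        state ^= state >> 7;
        state ^= state << 17;
        return state;
    }

    int main(void)
    {
        uint64_t buf[8192];             /* 64KB write buffer */
        int fd = open("testfile", O_CREAT | O_WRONLY | O_TRUNC, 0644);

        if (fd < 0)
            return 1;
        for (int i = 0; i < 16384; ++i) {   /* 16384 x 64KB = 1GB */
            for (size_t j = 0; j < sizeof(buf) / sizeof(buf[0]); ++j)
                buf[j] = xorshift64();
            /*
             * memset(buf, 0, sizeof(buf)) here instead would let a
             * compressing filesystem turn the whole gigabyte into a
             * handful of metadata updates.
             */
            write(fd, buf, sizeof(buf));
        }
        close(fd);
        return 0;
    }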
In particular, I just have to question the BTRFS and ZFS numbers for
most of these tests. From the looks of it, BTRFS is hardly going to
disk at all. That kinda implies its built-in compression is
trivializing the data set being written by the benchmark programs,
skewing the results badly.
The gzip tests make no sense. gzip is cpu bound. It looks to me like
the Linux tests are running an optimized version of gzip. This isn't
testing the filesystem at all.
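An easy way to check is to compare the cpu time against the wall time
of the run: if user+sys accounts for nearly all of real, the disk
never entered the picture. A sketch of that measurement using
wait4(), which is essentially what time(1) already does:

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/time.h>
    #include <sys/resource.h>
    #include <sys/wait.h>

    int main(int argc, char **argv)
    {
        struct timeval t1, t2;
        struct rusage ru;
        int status;
        pid_t pid;

        if (argc < 2) {
            fprintf(stderr, "usage: %s command [args]\n", argv[0]);
            return 1;
        }
        gettimeofday(&t1, NULL);
        if ((pid = fork()) == 0) {
            execvp(argv[1], argv + 1);
            _exit(127);
        }
        wait4(pid, &status, 0, &ru);    /* collects child cpu usage */
        gettimeofday(&t2, NULL);

        double real = (t2.tv_sec - t1.tv_sec) +
                      (t2.tv_usec - t1.tv_usec) / 1e6;
        double user = ru.ru_utime.tv_sec + ru.ru_utime.tv_usec / 1e6;
        double sys  = ru.ru_stime.tv_sec + ru.ru_stime.tv_usec / 1e6;
        printf("real %.2fs user %.2fs sys %.2fs\n", real, user, sys);
        return 0;
    }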
The postmark tests... you have to look at whether postmark is issuing
fsync() calls, and then you have to look at the fsync handling mode
for each filesystem. It's sad to say, but most filesystems do NOT
handle fsync() properly. So if you are going to run those sorts of
tests you have to be cognizant of the issue and at least set the
filesystems up to run fsync the same way, so the tests are more
realistic. The content of the files must also be checked. And,
again, the ext4 and btrfs numbers look just plain wrong. There's
something going on in there that is shortcutting the test.
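A quick sanity check I'd suggest (a sketch of my own, not part of
postmark) is to time a write+fsync loop. On a 7200rpm disk an honest
fsync costs at least one platter rotation, call it 8ms, so if the
average comes back in the microsecond range the filesystem is
acknowledging fsync() without actually committing the data:

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <fcntl.h>
    #include <sys/time.h>

    #define LOOPS 100

    int main(void)
    {
        char buf[4096];
        struct timeval tv1, tv2;
        int fd = open("fsync.test", O_CREAT | O_WRONLY | O_TRUNC,
                      0644);

        if (fd < 0)
            return 1;
        memset(buf, 'x', sizeof(buf));
        gettimeofday(&tv1, NULL);
        for (int i = 0; i < LOOPS; ++i) {
            pwrite(fd, buf, sizeof(buf), 0);  /* rewrite one block */
            fsync(fd);                  /* should hit the platter */
        }
        gettimeofday(&tv2, NULL);
        close(fd);

        double us = (tv2.tv_sec - tv1.tv_sec) * 1e6 +
                    (tv2.tv_usec - tv1.tv_usec);
        printf("avg fsync: %.0f us\n", us / LOOPS);
        return 0;
    }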
Compression in general is a very interesting dynamic that will
probably become more and more important as cpu power continues to
increase, particularly in MP environments. But if you are going to
run filesystem tests you need to be sure you are testing the same
thing on each filesystem and not hitting degenerate conditions due
to, e.g., the test data.
-Matt
Matthew Dillon
<dillon at backplane.com>