Benchmarking
Matthew Dillon
dillon at apollo.backplane.com
Sun Mar 28 10:09:57 PST 2004
:I finally got some time to do DragonFly; the (somewhat surprising)
:results can be seen here: http://geri.cc.fer.hr/~ivoras/osbench/osbench.sxc
:
:It seems that DF could do a lot better but there's something holding it
:back (especially in the web CMS test). I got a lot of 'too many database
:connections' errors, like the db server doesn't get the chance to free
:the connection when it's closed so that the next client can use it.
:Also, under DF a bug in a program was revealed: an object was not
:protected by a mutex/lock like it should be. It is interesting that the
:bug had not manifested on all the other tested systems, but on DF it was
:consistent and frequent (maybe the process-switching rate in DF is higher?).
:
:(This is work in progress, and I will create a world-readable report in
:PDF or HTML when it's finished. Until then, the above document will be
:the only published data).
Yes, the recent scheduler changes were actually a little too sensitive
when it came to preemptive switching of user processes when returning
from a system call. I actually softened it up a bit last night. Forked
children are also given a lower initial priority, so the initial spread
of cpu over the process set is very different. This combination seems
to do a good job bringing out MP bugs, though that wasn't the intention!
When I first made the scheduler changes all my -j N buildworlds started
failing (due to missing dependencies in the Makefiles)!
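
The class of bug described in the quoted report (a shared object that is not protected by a mutex/lock) can be sketched roughly as follows. This is only an illustration in Python; the `Counter` class and `run` helper are hypothetical names, not code from the benchmarked program. The update is a read-modify-write sequence, so a process or thread switch between the read and the write can lose an update unless a lock covers both steps.

```python
import threading


class Counter:
    """Hypothetical shared object whose update is a read-modify-write."""

    def __init__(self):
        self.value = 0
        self._lock = threading.Lock()

    def increment(self):
        # Without the lock, a switch between the read and the write
        # below can lose an update. Holding the lock makes the whole
        # sequence atomic with respect to other threads.
        with self._lock:
            current = self.value
            self.value = current + 1


def run(num_threads=8, per_thread=10000):
    """Increment one shared counter from several threads and return the total."""
    counter = Counter()

    def worker():
        for _ in range(per_thread):
            counter.increment()

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place the total is always num_threads * per_thread. Removing the lock may or may not show a wrong total on a given system, which is consistent with a bug that stays hidden until the scheduler switches more aggressively.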
'Too many database connections' errors sound like something that could
be tuned, but without more information it's hard to guess at what the
problem is. Maybe a soft file descriptor limit is being reached or
something. The CMS transaction rate issue is probably either related
to the reported error creating issues, or it is related to the over
active scheduler (which hopefully was fixed last night). The other
CMS numbers look right.
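
One way to check the soft file descriptor limit guess is to have the affected process report its own limits. The following is a minimal sketch using Python's standard `resource` module (Unix only); the helper names `fd_limits` and `raise_soft_fd_limit` are made up for illustration, and whether the database server in this test is actually hitting the limit is an assumption, not something the report confirms.

```python
import resource


def fd_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def raise_soft_fd_limit(target):
    """Raise the soft descriptor limit toward target, capped at the hard limit.

    Returns the soft limit in effect after the call.
    """
    soft, hard = fd_limits()
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already high enough, nothing to change
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)  # an unprivileged process cannot exceed hard
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target
```

Printing fd_limits() at server startup and comparing the soft value against the configured maximum number of connections would show quickly whether the limit is a plausible cause.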
The ByteBenches are generally a function of the compiler. FreeBSD-5 is
using GCC3 by default and on modern hardware it tends to produce better
code, hence the Dhrystone results. The execl throughput is a little
low; I was hoping it would be at least as good as 4.9, but at least it's
better than 5.x :-). The scheduler fixes *might* improve those numbers.
The pipe-based context switching looks reasonable. I actually would have
expected FreeBSD-5 to win here because their PIPE code is totally
giant-free. The shell script performance is rather odd, I'm not sure
I believe the FreeBSD-5.2-CUSTOM number there.
Bonnie++ looks in line with expectations. DragonFly should generally have
similar performance to 4.9 (i.e. better than 5.x). I'm not sure what
is going on with the Per-Char numbers but it isn't something we would
normally care about. NetBSD is obviously faking something there (probably
doing some caching even when told not to). Uncacheable VFS operations
(Sequential Create, Random Create) are going to be a bit slower on
DFly versus FreeBSD-4.9 due to the serializing token overhead. I'm
actually a bit surprised that DragonFly is beating out FreeBSD-5.x
there, perhaps FreeBSD-5.x is not being compiled with the same filesystem
optimizations (like UFS_DIRHASH). I may have bumped up the cache limits
for some of them in DragonFly and it just happens to fit the dataset.
Someone in FreeBSD-5 land should probably investigate the low numbers.
Note, however, that DragonFly's %CPU numbers scale very well to the
bench results. This seems to indicate that the issues are
concurrency/blocking related rather than hoggy code.
I don't know what ubench is doing, I would expect that since it is
a userland cpu-bound program that the numbers would be tied to the
compiler and thus similar to 4.9. The numbers aren't bad, just not
expected given the uniformity of the results from the other OSs.
If it is taking a lot of VM faults then this could actually be related
to a recent pmap bug fix that is in DFly but I think was put into
FreeBSD-5 after the 5.2 release.
The PG TPS numbers look about right. What you are seeing is SMP
overhead. FreeBSD-4.9 is probably serializing/batching the operations
more (which always makes raw TPS numbers look better), but if so this
is normally not observable unless you also measure the standard deviation
of the transaction latency. That's why raw TPS numbers make for bad
benchmarks. It's just too easy for a broken scheduler to revert to
batching and make them look better than they really are.
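
To make the TPS point concrete, here is a small sketch that reports the standard deviation of the transaction latency next to the raw TPS. The latency figures are invented purely for illustration and are not taken from the benchmark; `summarize` is a hypothetical helper built on Python's standard `statistics` module.

```python
import statistics


def summarize(latencies, wall_seconds):
    """Return (tps, mean latency, latency standard deviation) for one run."""
    tps = len(latencies) / wall_seconds
    mean = statistics.mean(latencies)
    stddev = statistics.pstdev(latencies)
    return tps, mean, stddev


# Invented data: per-transaction latencies in seconds over a 1 second run.
# "even" serves every client at the same pace; "batched" finishes some
# transactions almost immediately while others wait for most of the run.
even = [0.10] * 10
batched = [0.01] * 6 + [0.90] * 6

for name, run in (("even", even), ("batched", batched)):
    tps, mean, stddev = summarize(run, wall_seconds=1.0)
    print(f"{name:8s} tps={tps:5.1f} mean={mean:.3f}s stddev={stddev:.3f}s")
```

In this made-up data the batched run posts the higher TPS, but its latency standard deviation is far larger, which is the effect a raw TPS figure hides.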
CMS: already covered. I'd be interested in knowing whether rerunning
that test with the latest kernel (and after figuring out what is causing
the error messages) improves the numbers any.
Buildworld tests? Building the same world or each project's own worlds?
You can't really compare buildworld times because the projects have vastly
different data set sizes. For example, DragonFly rebuilds a lot more
when you run 'buildworld' than FreeBSD... it's rebuilding the entire tool
set rather than just some of the tools, and it's building two different
compilers instead of one. It's going to take longer, generally.
-Matt
Matthew Dillon
<dillon at xxxxxxxxxxxxx>