Interesting ubench scores for FreeBSD 4.11, 5.4, 6.0beta3 and DFly-Preview

Sun Sep 4 11:15:31 PDT 2005

--- Kris Kennaway <kkenn at xxxxxxxxxxxxxxxxxx> wrote:

> On 2005-09-03, Tomaž Borštnar
> <tomaz.borstnar at xxxxxxxx> wrote:
> > Kris Kennaway wrote:
> >>>Summary
> >>>Latest DFly-preview has best Ubench AVG of 106030, because of 2nd
> >>>best memory score (124353) and balanced CPU score (87707). Also good
> >>>was FreeBSD 6beta3/amd64 with Ubench AVG of 101448, because of
> >>>balanced CPU (102455) and memory score (100441). Seems like extra
> >>>registers in long mode help quite a bit. Third was FreeBSD 5.4/i386
> >>>with Ubench AVG of 94304 - mostly because of best memory score
> >>>(130602) and not so good CPU score (58006). Hardly behind was
> >>>FreeBSD 4.11 with Ubench AVG of 94296 - decent, but slower CPU score
> >>>(84431) and very nice memory score (104161). FreeBSD 6beta3/i386 was
> >>>severely behind all of them with Ubench AVG of 55968 - with cpu of
> >>>59138 and memory of 52799.
> >>>
> >>>I wish DFly had better CPU score :) But it
> still has best score among 32bit OS systems!
> >>>
> >>>As usual, YMMV :)
> >> 
> >> 
> >> Did you remember to disable the debugging features in FreeBSD 6
> >> (WITNESS, INVARIANTS, malloc debugging)?  If not, you're incurring a
> >> significant (usually >30%) performance penalty, which is consistent
> >> with your numbers.
> >
> > I checked FreeBSD 6beta3/i386 again - I forgot to reboot with new
> > kernel which I did this time. The only visible change is much better
> > memory score - 98040 this time(!), while CPU was 59062. Almost 100%
> > more from 52799 with GENERIC debug kernel! But average score is
> > still low compared to others - 78551.
> 
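As an aside, the AVG figures quoted above look like nothing more than the mean of the CPU and memory scores. Here is a quick Python sketch that checks this against the reported numbers. The integer-mean formula is my assumption from the figures themselves, not something taken from the ubench source, and the helper names are made up for the example.

```python
# Reported (CPU, memory, AVG) ubench scores, copied from the quoted messages.
scores = {
    "DFly-preview": (87707, 124353, 106030),
    "FreeBSD 6beta3/amd64": (102455, 100441, 101448),
    "FreeBSD 5.4/i386": (58006, 130602, 94304),
    "FreeBSD 4.11": (84431, 104161, 94296),
    "FreeBSD 6beta3/i386 (GENERIC debug kernel)": (59138, 52799, 55968),
    "FreeBSD 6beta3/i386 (retest with new kernel)": (59062, 98040, 78551),
}


def ubench_avg(cpu, mem):
    # Assumption: AVG is the integer (truncated) mean of the two scores.
    return (cpu + mem) // 2


def pct_change(old, new):
    # Relative change from old to new, in percent.
    return 100.0 * (new - old) / old


for name, (cpu, mem, avg) in scores.items():
    print(f"{name}: computed {ubench_avg(cpu, mem)}, reported {avg}")

# Memory score on 6beta3/i386 before and after switching kernels.
mem_gain = pct_change(52799, 98040)
print(f"memory gain: {mem_gain:.1f}%")
```

If that assumption holds, every reported AVG is reproduced exactly, and the "almost 100%" memory improvement works out to roughly 86%.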
> I did some tests on a UP pentium 3 machine, and
> confirmed that most of
> the difference between FreeBSD 4.x and later
> versions is due to the
> different compiler being used to compile the
> code (gcc 2.95 vs 3.4).
> gcc 3.x is known to have wildly different
> performance characteristics
> than 2.x (better on some code, worse on other).
> When I retested 5.x
> and above with a FreeBSD 4.x binary (statically
> linked), I found
> somewhat different results.
> 
> I ran at least 10 tests on each platform and
> then used the ministat
> tool (/usr/src/tools/ministat on freebsd) to
> perform a statistical
> comparison.
> 
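For anyone without ministat at hand, the kind of comparison described here can be sketched in a few lines of Python. This is a pooled-variance Student's t-test, which I believe is roughly what ministat does, but treat that as an assumption. The scores below are made up for illustration and are not anyone's measurements, and the critical value is hard-coded for two samples of 10 runs each.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t for 18 degrees of freedom
# (two samples of 10 runs each). Other sample sizes need another value.
T_CRIT_DF18 = 2.101


def pooled_t(a, b):
    """Return (difference of means, t statistic) using a pooled variance."""
    na, nb = len(a), len(b)
    va, vb = statistics.variance(a), statistics.variance(b)
    pooled_var = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    std_err = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    diff = statistics.mean(b) - statistics.mean(a)
    return diff, diff / std_err


def differs(a, b, t_crit=T_CRIT_DF18):
    """True if the two samples differ at the chosen confidence level."""
    _, t = pooled_t(a, b)
    return abs(t) > t_crit


# Made-up scores, 10 runs per platform, for illustration only.
runs_old = [100, 102, 98, 101, 99, 100, 103, 97, 100, 100]
runs_new = [94, 96, 92, 95, 93, 94, 97, 91, 94, 94]

diff, t = pooled_t(runs_old, runs_new)
print(f"difference of means: {diff:+.1f}, t = {t:.2f}")
if differs(runs_old, runs_new):
    print("distinguishable at 95% confidence")
else:
    print("no difference proven at 95% confidence")
```

The point of running at least 10 tests per platform is visible in the formula: the standard error shrinks as the number of runs grows, so small but real differences become detectable.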
> When using the same binary, the CPU scores are
> statistically
> indistinguishable between the different FreeBSD
> versions.  This makes
> sense since there's little kernel involvement in
> running userland
> integer/FP computations.  When running the gcc
> 2.95 binary all
> versions of FreeBSD were 31% *faster* on this
> test than when running a
> gcc 3 binary (both compiled with -O only).
> 
> FreeBSD 5.x and above show a 6.3% drop on the
> memory test relative to
> 4.x (with the same 4.x binary).  I reran ubench
> with kernel profiling
> enabled and found that this drop is mostly due
> to the vm locking
> present in FreeBSD 5 and above (via vm_fault). 
> This locking is also
> responsible for the dramatic performance
> increases on SMP machines
> seen in other benchmarks, so it would be more
> interesting to test on
> SMP machines.  I'm not set up to do this on my
> hardware though.
> 
> The memory test showed about a 14.6% benefit
> from using the gcc 3
> binary vs gcc 2.95.  This is presumably due to
> better code generation
> in e.g. memset/memcpy.
> 
> In summary: the CPU test is a priori not very
> useful for comparing
> performance of different OSes, because it is
> largely a test of the
> code generated by the compiler (unless you can
> eliminate this
> variable).  The lesson is that if your
> application includes a lot of
> CPU intensive code, you should carefully
> benchmark its performance
> with different compilers and optimizations to
> see what works best.
> Thus, the 'average' number is also not
> meaningful as a point of
> comparison since it is tainted by this fact.
> 
> The memory test is a bit more meaningful since
> it contains a kernel
> component, but it's still influenced quite a
> lot by the compiler, so
> it's also hard to compare directly unless you
> can eliminate that
> variable.
> 
> Kris

While I agree with your conclusion generally, I
think your logic is wrong here. Since the
compiler version used to build an OS is a
fundamental property of the OS, I can't see how
you can discount it as a factor in an OS being
fast or slow. It's not practical to use something
else. Using different optimizations is another
matter, as that's a practical alternative for the
average user.

The problem with ubench is that it's utterly
useless in testing an OS, because it doesn't
exercise the kernel, which is what makes or
breaks the OS. It is also likely that it runs
entirely in the CPU cache, which makes it even
less than useless.

And aren't memcpy and memset written in
assembler?

DT
