Performance results / VM related SMP locking work - committed (3)
Alex Hornung
ahornung at gmail.com
Fri Oct 28 21:05:29 PDT 2011
Great work!
Nonetheless, I feel that the last few changes nerf a quad-core machine
way too much: you are giving up about 50% of what you gained in the -j48
buildkernel case, and the -j4 case, which is the most common one, is now
even worse than it was originally. buildworld -j8 on test29 also loses
50% of the original improvement with commit 2 or 3.
I don't think this is a good trade-off at all. Are we really optimizing
for 4-socket 48-core machines and leaving the far more common 4-8 core
machines out?
Simply adding lwkt_yield() calls all over the place doesn't really sound
like a great strategy in the first place. It sounds more like a stopgap
or debugging aid for a 48-core machine than something that should be
committed as-is.
Cheers,
Alex Hornung
On 29/10/11 00:28, Matthew Dillon wrote:
> 89.61 real 196.30 user 59.04 sys test29 -j4 (patch)
> 86.55 real 195.14 user 49.52 sys test29 -j4 (commit)
> 93.77 real 195.94 user 67.68 sys test29 -j4 (commit 3)
>
> 167.62 real 360.44 user 4148.45 sys monster -j48 (prepatch)
> 110.26 real 362.93 user 1281.41 sys monster -j48 (patch)
> 101.68 real 380.67 user 1864.92 sys monster -j48 (commit 1)
> 59.66 real 349.45 user 208.59 sys monster -j48 (commit 3)<<<
>
> 96.37 real 209.52 user 63.77 sys test29 -j48 (patch)
> 85.72 real 196.93 user 52.08 sys test29 -j48 (commit 1)
> 90.01 real 196.91 user 70.32 sys test29 -j48 (commit 3)
>
> Kernel build results are as expected for the most part. -j 48 build
> times on the many-cores monster are GREATLY improved, from 101 seconds
> to 59.66 seconds (and down from 167 seconds before this work began).
>
> That's a +181% improvement, almost 3x faster.
>
> The -j 4 build and the quad-core test29 build were not expected to show
> any improvement since there isn't really any spinlock contention with
> only 4 cores. There was a slight nerf on test29 (the quad-core box) but
> that might be related to some of the lwkt_yield()s added and not so
> much the PQ_INACTIVE/PQ_ACTIVE vm_page_queues[] changes.
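
For anyone who wants to recheck the percentages being thrown around in this thread, here is a rough sketch that recomputes them from the "real" times quoted above. The helper names are made up for illustration, and since no prepatch numbers for test29 appear in the quote, the "(patch)" rows are assumed to be the baseline for that box.

```python
# Wall-clock ("real") seconds copied from the quoted results.
real = {
    ("monster", "-j48", "prepatch"): 167.62,
    ("monster", "-j48", "commit 1"): 101.68,
    ("monster", "-j48", "commit 3"): 59.66,
    ("test29", "-j4", "patch"): 89.61,
    ("test29", "-j4", "commit"): 86.55,
    ("test29", "-j4", "commit 3"): 93.77,
    ("test29", "-j48", "patch"): 96.37,
    ("test29", "-j48", "commit 1"): 85.72,
    ("test29", "-j48", "commit 3"): 90.01,
}


def improvement_pct(before, after):
    """Throughput improvement in percent: 2x faster is +100%."""
    return (before / after - 1.0) * 100.0


def gain_given_back(base, best, final):
    """Fraction of the time saved going base -> best that is lost again at final.

    A value above 1.0 means final is slower than base.
    """
    return (final - best) / (base - best)


if __name__ == "__main__":
    # monster -j48, prepatch versus commit 3
    m = improvement_pct(real[("monster", "-j48", "prepatch")],
                        real[("monster", "-j48", "commit 3")])
    print(f"monster -j48: {m:+.0f}%")

    # test29, how much of the earlier gain commit 3 gives back
    for jobs, best in (("-j4", "commit"), ("-j48", "commit 1")):
        g = gain_given_back(real[("test29", jobs, "patch")],
                            real[("test29", jobs, best)],
                            real[("test29", jobs, "commit 3")])
        print(f"test29 {jobs}: {g:.0%} of the gain given back")
```

This only looks at real time; the sys column would need the same treatment to say anything about contention.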