Update on recent SMP contention work (2)
Matthew Dillon
dillon at apollo.backplane.com
Thu Oct 24 22:20:40 PDT 2013
Another veritable ton of SMP performance work has gone into master.
* The entire [v]fork/exec/exit/wait path has been streamlined and
essentially no longer has any SMP contention.
* The pid/process-group/session mechanics have been rewritten and
essentially no longer have any SMP contention.
* The entire VM fault path, particularly for initial COWs on binaries
(to support concurrent execs), is now able to use shared fine-grained
locks end-to-end and has no SMP contention.
As in zero. Running 8 threads doing fork/exec/wait of an ELF binary
on one of the Haswell blades went from 10.60 seconds for 80000 total
execs to 3.8 seconds. That's roughly a 2.8x improvement in
performance.
* tmpfs performance has been radically improved. It turns out that
most of the code was fine-grained locked but still had coarse-grained
per-mount locks wrapped around most of the VNOP operations. I took
a pass on it and removed most of the coarse-grained locks.
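
For anyone who wants to try a similar fork/exec/wait measurement, here is
a minimal sketch in C. It is not the test program used for the numbers
above; the function names (spawn_once, run_workers, timed_run) are made up
for illustration, and it assumes concurrent worker processes rather than
threads, each looping over fork/exec/wait of whatever binary is passed in.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* One fork/exec/wait cycle. Returns 0 if the child ran and exited 0. */
int spawn_once(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execv(argv[0], argv);
        _exit(127);             /* only reached if execv() failed */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}

/*
 * Start nworkers concurrent worker processes (an assumption, the
 * original test may have used threads). Each worker performs
 * per_worker fork/exec/wait cycles. Returns the number of workers
 * that completed every cycle successfully.
 */
int run_workers(int nworkers, int per_worker, char *const argv[])
{
    int started = 0;
    int ok = 0;

    for (int w = 0; w < nworkers; w++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0) {
            for (int i = 0; i < per_worker; i++) {
                if (spawn_once(argv) != 0)
                    _exit(1);
            }
            _exit(0);
        }
        started++;
    }
    for (int w = 0; w < started; w++) {
        int status;
        if (wait(&status) > 0 &&
            WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ok++;
    }
    return ok;
}

/* Wall-clock seconds for a full run; *ok_workers gets the success count. */
double timed_run(int nworkers, int per_worker, char *const argv[],
                 int *ok_workers)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    *ok_workers = run_workers(nworkers, per_worker, argv);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    return (double)(t1.tv_sec - t0.tv_sec) +
           (double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
}
```

Calling timed_run(8, 10000, argv, &ok) with argv pointing at a small ELF
binary gives 8 workers and 80000 total execs, the same shape as the run
described above. Absolute times will depend on the binary and the machine.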
The previous block of work got rid of 90% of the contention on the
smaller systems (the 4-core/8-thread blades), but was lacking on the
bigger system (monster's 48-core Opteron).
This most recent set of work has gotten rid of 98% of the contention
on the smaller systems and probably 90%+ of the contention on monster.
The only system paths which still have noticeable contention are the
filesystem write paths.
--
Bulk package builds (dports) on monster are under test now. There are no
results yet, but the last week or two has brought the full build for 20,000+
packages, from scratch, down to around 15 hours. The current tests
should be able to beat that.
As with prior work, there may be some instability. I will continue to
work through whatever bugs show up and exercise various subsystems such as
swapcache and paging under heavy loads to locate and fix any further
problems.
-Matt
Matthew Dillon
<dillon at backplane.com>