git: kernel - Scale tsleep() performance vs many (thousands) of processes
dillon at crater.dragonflybsd.org
Sun Aug 13 00:20:22 PDT 2017
Author: Matthew Dillon <dillon at apollo.backplane.com>
Date: Sat Aug 12 23:35:47 2017 -0700
kernel - Scale tsleep() performance vs many (thousands) of processes
* In situations where a huge number of processes or threads are present
and sleeping (that is, more than a few thousand), the global cpumask hash
table used by tsleep() would saturate and effectively cause any wakeup()
call to broadcast to all CPUs.
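The saturation described above can be illustrated with a minimal sketch. This is not the DragonFly implementation: the names `mask_table`, `note_sleeper`, and `wakeup_targets`, the 64-CPU limit, and the hash function are all hypothetical, and clearing bits on wakeup is omitted. Each slot holds a bitmask of CPUs that have a sleeper whose ident hashes to that slot, so a small table shared by many sleepers ends up with every bit set in every slot.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NCPUS 64u            /* hypothetical CPU limit for this sketch */

struct mask_table {
    uint64_t *slots;                /* one CPU bitmask per hash slot */
    size_t    nslots;
};

/* Map a sleep ident to a slot (illustrative multiplicative hash). */
static size_t slot_for(const struct mask_table *t, uintptr_t ident)
{
    assert(t->nslots > 0);
    return (size_t)((((uint64_t)ident * 0x9E3779B97F4A7C15ull) >> 32)
                    % t->nslots);
}

/* A thread on 'cpu' goes to sleep on 'ident': mark the CPU in the slot. */
static void note_sleeper(struct mask_table *t, uintptr_t ident, unsigned cpu)
{
    assert(cpu < SKETCH_NCPUS);
    t->slots[slot_for(t, ident)] |= (uint64_t)1 << cpu;
}

/* A wakeup on 'ident' must notify every CPU marked in the slot. */
static uint64_t wakeup_targets(const struct mask_table *t, uintptr_t ident)
{
    return t->slots[slot_for(t, ident)];
}

/* Count how many CPUs a wakeup would have to notify. */
static unsigned count_targets(uint64_t mask)
{
    unsigned n = 0;
    while (mask) {
        mask &= mask - 1;           /* clear lowest set bit */
        n++;
    }
    return n;
}
```

With a single slot, sleepers on eight different CPUs make a wakeup on any ident target all eight CPUs. With many more slots than sleepers, a wakeup usually targets only the CPUs that actually hold a matching sleeper.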
* Refactor the tsleep initialization code to allow the global cpumask
hash table and the pcpu hash tables to be dynamically allocated.
* Allocate a MUCH larger global cpumask hash table, and significantly
smaller pcpu hash tables. The global cpumask hash table is now
sized to approximate 2 * maxproc, greatly reducing cpumask collisions
when large numbers of processes exist in the system.
The pcpu hash tables can be smaller without affecting performance. This
will simply result in more entries in each queue which are trivially
Nominal maxproc ~32,000 -> in the noise (normal desktop system)
Nominal maxproc ~250,000 -> 16MB worth of hash tables (on a 128G box)
Maximal maxproc ~2,000,000 -> 122MB worth of hash tables (on a 128G box)
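As a rough check on the sizes quoted above, the arithmetic can be sketched as follows. The 32-byte slot size is an assumption (for example, a 256-bit CPU mask per slot), not something stated in this commit, and the sketch ignores the pcpu tables. Under that assumption, 2 * maxproc slots come to 16,000,000 bytes for maxproc 250,000 and 128,000,000 bytes (about 122 MiB) for maxproc 2,000,000, which is in line with the figures listed.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size of one global hash slot; hypothetical, see lead-in. */
#define ASSUMED_SLOT_BYTES 32u

/* The commit sizes the global table to approximately 2 * maxproc slots. */
static uint64_t global_slots(uint64_t maxproc)
{
    return 2 * maxproc;
}

/* Estimated memory for the global table under the assumed slot size. */
static uint64_t global_table_bytes(uint64_t maxproc)
{
    uint64_t slots = global_slots(maxproc);
    assert(slots >= maxproc);       /* guard against overflow */
    return slots * ASSUMED_SLOT_BYTES;
}
```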
* Remove the unused sched_quantum sysctl and variable.
* Tested by running a pipe() chain through 900,000 processes. The
end-to-end latency dropped from 25 seconds to 10 seconds and the
pcpu IPI rate dropped from 60,000 IPIs/cpu to 5,000 IPIs/cpu. This
is still a bit more than ideal, but much better than before.
* Fix a low-memory panic in zalloc(). A possible infinite recursion
was not being properly handled.
Summary of changes:
sys/kern/init_main.c | 5 +-
sys/kern/kern_synch.c | 257 +++++++++++++++++++++++++++++++-------------------
sys/sys/kernel.h | 1 +
sys/sys/proc.h | 2 +-
sys/vm/vm_zone.c | 36 +++++--
5 files changed, 196 insertions(+), 105 deletions(-)