cvs commit: src/sys/kern kern_clock.c kern_synch.c lwkt_thread.c sys_pipe.c usched_bsd4.c src/sys/platform/pc32/i386 trap.c src/sys/platform/pc64/amd64 trap.c src/sys/platform/vkernel/i386 trap.c src/sys/sys param.h thread.h
Matthew Dillon
dillon at crater.dragonflybsd.org
Mon Sep 8 21:08:22 PDT 2008
dillon 2008/09/08 21:06:20 PDT
DragonFly src repository
Modified files:
sys/kern kern_clock.c kern_synch.c lwkt_thread.c
sys_pipe.c usched_bsd4.c
sys/platform/pc32/i386 trap.c
sys/platform/pc64/amd64 trap.c
sys/platform/vkernel/i386 trap.c
sys/sys param.h thread.h
Log:
Fix issues with the scheduler that were causing unnecessary reschedules
between tightly coupled processes as well as inefficient reschedules under
heavy loads.
The basic problem is that a process entering the kernel is 'passively
released', meaning its thread priority is left at TDPRI_USER_NORM. The
thread priority is only raised to TDPRI_KERN_USER if the thread switches
out. This has the side effect of forcing a LWKT reschedule whenever any other
user process wakes up from a blocked condition in the kernel, regardless of
its user priority, because its LWKT thread is at the higher
TDPRI_KERN_USER priority. This resulted in some significant switching
cavitation under load.
There is a twist here because we do not want to starve threads running in
the kernel acting on behalf of a very low priority user process, because
doing so can deadlock the namecache or other kernel elements that sleep with
lockmgr locks held. In addition, the 'other' LWKT thread might be associated
with a much higher priority user process that we *DO* in fact want to give
cpu to.
The solution is elegant. First, do not force a LWKT reschedule for the
above case. Second, force a LWKT reschedule on every hard clock. Remove
all the old hacks. That's it!
The result is that the current thread is allowed to return to user
mode and run until the next hard clock even if other LWKT threads (running
on behalf of a user process) are runnable. Pure kernel LWKT threads still
get absolute priority, of course. When the hard clock occurs the other LWKT
threads get the cpu and at the end of that whole mess most of those
LWKT threads will be trying to return to user mode and the user scheduler
will be able to select the best one. Doing this on a hard clock boundary
prevents cavitation from occurring at the syscall enter and return boundary.
With this change the TDF_NORESCHED and PNORESCHED flags and their associated
code hacks have also been removed, along with lwkt_checkpri_self() which
is no longer needed.
Revision  Changes    Path
1.62      +9 -0      src/sys/kern/kern_clock.c
1.91      +0 -3      src/sys/kern/kern_synch.c
1.117     +20 -30    src/sys/kern/lwkt_thread.c
1.50      +4 -4      src/sys/kern/sys_pipe.c
1.25      +1 -1      src/sys/kern/usched_bsd4.c
1.115     +13 -11    src/sys/platform/pc32/i386/trap.c
1.3       +4 -11     src/sys/platform/pc64/amd64/trap.c
1.35      +4 -10     src/sys/platform/vkernel/i386/trap.c
1.52      +0 -1      src/sys/sys/param.h
1.95      +1 -2      src/sys/sys/thread.h
http://www.dragonflybsd.org/cvsweb/src/sys/kern/kern_clock.c.diff?r1=1.61&r2=1.62&f=u
http://www.dragonflybsd.org/cvsweb/src/sys/kern/kern_synch.c.diff?r1=1.90&r2=1.91&f=u
http://www.dragonflybsd.org/cvsweb/src/sys/kern/lwkt_thread.c.diff?r1=1.116&r2=1.117&f=u
http://www.dragonflybsd.org/cvsweb/src/sys/kern/sys_pipe.c.diff?r1=1.49&r2=1.50&f=u
http://www.dragonflybsd.org/cvsweb/src/sys/kern/usched_bsd4.c.diff?r1=1.24&r2=1.25&f=u
http://www.dragonflybsd.org/cvsweb/src/sys/platform/pc32/i386/trap.c.diff?r1=1.114&r2=1.115&f=u
http://www.dragonflybsd.org/cvsweb/src/sys/platform/pc64/amd64/trap.c.diff?r1=1.2&r2=1.3&f=u
http://www.dragonflybsd.org/cvsweb/src/sys/platform/vkernel/i386/trap.c.diff?r1=1.34&r2=1.35&f=u
http://www.dragonflybsd.org/cvsweb/src/sys/sys/param.h.diff?r1=1.51&r2=1.52&f=u
http://www.dragonflybsd.org/cvsweb/src/sys/sys/thread.h.diff?r1=1.94&r2=1.95&f=u