Lockless algorithms [was Re: splxxx replacements?]
Matthew Dillon
dillon at apollo.backplane.com
Fri Nov 17 09:15:42 PST 2006
:Matt,
:
:What's going on with DragonFly ? It might be a good thing to have some
:technical discusssion on commit be summarized here as well. Apparently
:you folks decided to move away from tokens in favor of straight
:spinlocks since they were a pain in the ass to use and there was no public
:mention about this.
:
:Another thing I've heard is that your route table stuff went lockless,
:which is a significant change and worth noting as well as discussing. And
:fron IRC discussions, RCU is starting to be pushed by various people,
:correct ?
:
:What's going on with you folks and lockless algorithms ?
:
:bill
The route table work is rather significant, though it won't really
shine until Giant is removed from the protocol threads. Basically
the route table is now replicated across all cpus and thus can be
accessed by any given cpu with only a critical section for
protection.
We still use tokens in several major areas, but we have been shifting
the ones that were only being used to lock short codepaths over to
spinlocks.
Basically DragonFly has five locking primitives:
* Critical Sections
Prevents preemption, Interrupts, and IPIs from occuring on the
current cpu only. Automatically released if the thread switches
away and is reacquired when the thread switches back. Typically
used to interlock operations local to the cpu.
Critical section calls are, for all intents and purposes, free.
They are implemented with just a small handful of instructions and
no locked bus cycle instructions. Overhead is less then 5 nS.
* Spinlocks
May only be used to lock very short code paths. Severe restrictions
on use. May not block while holding a spinlock. Automatically
enters a critical section. Deadlocks automatically detected and
will panic the system.
Spinlocks do not block or cause a thread switch to occur and can
be used in code paths that are not allowed to block.
It should be noted that spinlocks are NOT in any way similar to
FreeBSD's mutexes. In DragonFly spinlocks are considered to be
very, very low level mechanisms that are not to be held across
complex code paths.
* Tokens
May be used to lock long code paths. May be nested. Automatically
released if a thread blocks or switches away (not released on
preemption, only on explicit switches), and is reacquired when the
thread resumes. That is, guarentees atomicy within the code domain
the token was acquired for only while the thread is running. If
the code domain blocks the token is released and reacquired on
resume.
Tokens may block and may cause a thread switch to occur and can
only be used in code paths where blocking is ok.
* LockMgr Locks
May be used to lock long code paths. May be nested. NOT released
when a thread blocks. Has a fairly heavy-weight API (but I have
simplified it enormously over this development cycle).
* Serializing locks
Serializing locks are basically a simplified version of exclusive
lockmgr locks. They are allowed to block on aquisition and are
held across thread switches and blockages. The interrupt subsystem
tightly integrates serializing locks to efficiently manage
interlocks between mainline code running for a driver and the
interrupt handler for that same driver.
These effectively replace spl's, but only from the point of view
of any particular driver. Most spl's in the system are unrelated
to drivers and are more appropriately turned into critical sections.
I may remove serializing locks in favor of LockMgr locks due to
the simplification that has occured in the LockMgr API over the
last year.
--
RCU is interesting but I personally prefer replication and
cpu-localization instead of RCU. So far nothing has come up that really
needs an RCU implementation. RCU suffers from a level of complexity
that makes it somewhat difficult to understand the code using it, and
I don't like that aspect of it.
The use of tokens in the system is primarily being relegated to situations
that actually need the above characteristics. A very good example of
this is the traversal of system structures such as the system mountlist.
An example of incorrect use would be in, say, kern_objcache.c, where we
really ought to be using a spinlock there instead.
-Matt
Matthew Dillon
<dillon at xxxxxxxxxxxxx>
More information about the Kernel
mailing list