lkwt in DragonFly

Jeremy Messenger mezz7 at cox.net
Fri Feb 6 19:04:53 PST 2004


On Fri, 06 Feb 2004 18:33:47 -0800, Matthew Dillon wrote:

>     First, Bosko, thanks for asking your question!  You really got me
>     thinking about our token implementation.  I've been whining about it
>     being broken on our mailing lists for quite a while and as I was
>     formulating a reply to your email I had an epiphany.  So I hope you
>     don't mind me having CC'd your email and my reply with the kernel@
>     list!
> 
> :Matt,
> :
> :    Reading this:
> :
> :Token Passing Primitives Not Mutexes
> :...
> :
> :   Do you intend to have a single pcpu token on a given cpu protecting
> :   all data structures (or a class of data structures)?  The tradeoff of
> :   the lesser-granularity vs the admittedly much lower cost for a common
> :   token acquisition may or may not be worth it in the end.  (The other
> :   part to the tradeoff is that you can never go with full-preemption).
> :   I did understand this correctly, right?
> :
> :Cheers,
> :--
> :Bosko Milekic  *  bmilekic at xxxxxxxxxxxxxxxx  *  bmilekic at xxxxxxxxxxx
> 
>     Not entirely.  First, please note that the current token
>     implementation does not work very well... it's a mess to use because
>     cpu #B can steal away cpu #A's token by sending cpu #A an IPI
>     message requesting the token. This means cpu #A has to be in a
>     critical section to 'protect' the token, and must re-acquire the
>     token (in case it was lost) after it blocks.
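
(For illustration only: a minimal sketch of the usage pattern the old
semantics force on callers.  The names here -- crit_enter, tok_acquire,
blocking_operation -- are invented for this sketch, not the actual
DragonFly token API.)

    struct token;                       /* hypothetical token type */

    void crit_enter(void);              /* disable preemption on this cpu */
    void crit_exit(void);
    void tok_acquire(struct token *);   /* may spin or wait for an IPI reply */
    void blocking_operation(void);

    void
    old_style_user(struct token *tok)
    {
            crit_enter();               /* keep a remote IPI from stealing it */
            tok_acquire(tok);
            /* ... touch the structures the token protects ... */
            crit_exit();

            blocking_operation();       /* token may be stolen while we sleep */

            crit_enter();
            tok_acquire(tok);           /* must re-acquire; cached state may be stale */
            /* ... re-validate anything computed before the block ... */
            crit_exit();
    }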
> 
>     After thinking about this quite a bit I have come up with a solution
>     that, in a moment of enlightenment, would make our tokens operate like
>     self-contained little RCU implementations (superior to Linux's
>     implementation, in fact!).

RCU = Read-Copy Update, right? Just asking, so I can find the right
paper to read more about it.

Cheers,
Mezz

>     I intend to fix our tokens by not allowing an IPI to steal a token
>     until the owning cpu has gone through a thread switch, or by
>     explicit release.  This means that, once fixed, our token
>     implementation will allow us to abstract RCU-like operations, which
>     I think is very important.  It is not what I originally intended
>     tokens to be but it is turning out to be what they should be.
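
(A rough model of the handoff rule described above, where a token owned
by a cpu is only handed to a requesting cpu at the owner's next thread
switch or explicit release.  All structure and function names are
invented for illustration; they are not the real kernel code.)

    struct cpu;                              /* opaque per-cpu structure */

    struct token {
            struct cpu *t_owner;             /* cpu that currently owns it */
            struct cpu *t_reqcpu;            /* cpu asking for it, if any  */
    };

    /*
     * Runs on the owning cpu when the "give me the token" IPI arrives.
     * Under the new rule it only records the request; ownership does
     * not change here.
     */
    static void
    token_request_ipi(struct token *tok, struct cpu *from)
    {
            tok->t_reqcpu = from;
    }

    /*
     * Runs from the owning cpu's thread-switch path (or an explicit
     * release).  Only here does the token actually move, so code that
     * held it without blocking saw a perfectly stable view of its data.
     */
    static void
    token_handoff(struct token *tok, struct cpu *self)
    {
            if (tok->t_owner == self && tok->t_reqcpu != NULL) {
                    tok->t_owner = tok->t_reqcpu;
                    tok->t_reqcpu = NULL;
            }
    }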
> 
>     The nice thing about the revamp I intend to do on the token code is
>     that the RCU abstraction will wind up being a lot better than
>     Linux's RCU abstraction, because it will be compartmentalized.  Only
>     cpus competing for the same token are affected rather than all cpus
>     in the system. We would not have the serialization problem that
>     Linux has.
> 
>     Tokens will be used in a manner similar to how mutexes are used in
>     FreeBSD-5, except that in our case (with the revamped token code)
>     DFly will be able to hold any number of tokens across a blocking
>     situation without creating deadlocks (but also not guaranteeing
>     atomicity across the blocking situation).  I think this is exactly the
>     sort of abstraction we need.  Mutexes in FBSD-5 and tokens in DFly
>     are primarily used as interlocks on structures, for which no
>     blocking occurs, or in situations where the blocking condition is
>     well managed (e.g. lockmgr).  I feel that my new token (RCU-like)
>     implementation will wind up being far superior to FreeBSD-5's mutex
>     implementation for 90% of these situations and it will not suffer
>     from the deadlock potential that mutexes have because it will not
>     guarantee atomicity across a blocking call.
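
(What that contract might look like from a caller's point of view:
several tokens held across a block without deadlock, but with no
atomicity across the block.  gettoken/reltoken and the item structure
are placeholders for this sketch, not the real interface.)

    struct token;
    struct item { int i_valid; };

    void gettoken(struct token *);      /* placeholder acquire */
    void reltoken(struct token *);      /* placeholder release */
    int  blocking_io(struct item *);    /* may sleep */

    int
    update_item(struct token *list_tok, struct token *item_tok, struct item *it)
    {
            gettoken(list_tok);
            gettoken(item_tok);         /* two tokens held at once: no deadlock,
                                           they simply follow the thread */

            if (blocking_io(it) != 0) { /* atomicity ends here */
                    reltoken(item_tok);
                    reltoken(list_tok);
                    return (-1);
            }

            /*
             * After a block we still hold the tokens, but the protected
             * state may have changed underneath us; re-check before
             * trusting anything cached earlier.
             */
            if (it->i_valid == 0) {
                    reltoken(item_tok);
                    reltoken(list_tok);
                    return (-1);
            }

            /* ... safe, non-blocking manipulation of the item ... */
            reltoken(item_tok);
            reltoken(list_tok);
            return (0);
    }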
> 
>     Also note that several major DragonFly subsystems (LWKT scheduler
>     and SLAB allocator) are totally mutex & token free and things like
>     the network stack will also be mostly mutex & token free because it
>     will assign PCBs to particular threads (so operations on those PCBs
>     are only done by a particular thread which in turn means no locking
>     of the PCB is required at all).
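
(A toy model of the "PCB owned by one thread" idea: code on other cpus
never touches the PCB directly, it queues a request to the owning
protocol thread.  The message queue primitive and field names here are
stand-ins, not the LWKT messaging API.)

    struct pcb {
            int owner_thread;           /* the only thread that touches this PCB */
            /* ... connection state, no locks at all ... */
    };

    struct pcb_msg {
            struct pcb *m_pcb;
            void      (*m_func)(struct pcb *);
    };

    /* Stand-in for a per-thread message queue primitive. */
    void send_to_thread(int thread, struct pcb_msg *msg);

    /*
     * Code running on any cpu never locks the PCB; it asks the owning
     * protocol thread to do the work instead.
     */
    static void
    pcb_request(struct pcb *pcb, struct pcb_msg *msg, void (*func)(struct pcb *))
    {
            msg->m_pcb = pcb;
            msg->m_func = func;
            send_to_thread(pcb->owner_thread, msg);
    }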
> 
>     The performance trade off is basically the ability for the cpu
>     owning a token to operate on the token without having to use
>     bus-locked instructions, versus having to send an IPI message to the
>     owning cpu when the requesting cpu does not own the token.  This is
>     not the major reason for using tokens, though.  Still, judging by
>     the benchmark comparisons we've done between 4.x and DFly, the
>     overhead of a 'miss' appears to be very small.  It turns out that
>     most locks (like vnode locks) hit the same-cpu case a *LOT* more
>     than they hit the foreign-cpu case so the overall benefit is at
>     least equivalent to FreeBSD-5's mutex overhead.  It certainly isn't
>     worse.  We do lose when it comes to IPC operations between two
>     processes running on different CPUs, but if this is the only thing
>     that winds up being a problem it is a simple enough case to fix.
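
(A sketch of why the same-cpu hit is cheap: ownership is tested with an
ordinary compare, and only a miss falls into the expensive cross-cpu
path.  The names and the simplified spin are illustrative only.)

    struct cpu;

    struct token {
            struct cpu * volatile t_owner;   /* volatile so the spin below
                                                re-reads it (simplified)   */
    };

    struct cpu *current_cpu(void);               /* placeholder            */
    void send_token_request_ipi(struct token *); /* the expensive miss path */

    static void
    token_acquire(struct token *tok)
    {
            struct cpu *self = current_cpu();

            if (tok->t_owner == self) {
                    /*
                     * Hit: this cpu already owns the token.  A plain
                     * compare is enough; no LOCK-prefixed instruction is
                     * needed because only this cpu ever gives its own
                     * token away, and only at a switch boundary.
                     */
                    return;
            }

            /* Miss: ask the owning cpu and wait for the handoff. */
            send_token_request_ipi(tok);
            while (tok->t_owner != self)
                    ;                            /* spin/yield, simplified */
    }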
> 
>     However, the major reason for using tokens is that it creates an
>     abstraction that allows us to execute many types of operations as if
>     the system had only one cpu.  Since a token is owned by a cpu and
>     not a thread, if thread #1 obtains a token and blocks, thread #2
>     running on the same cpu can obtain that same token optimally.
>     Likewise, thread #3 running on a different cpu can 'steal' the token
>     (but it only gets transferred on a scheduler boundary of the first
>     cpu once I fix the current token code).  We also have the advantage
>     of being able to use this abstraction heavily without compromising
>     UP performance, because on a UP box a blocking situation implies a
>     scheduler switch and since tokens are per-cpu... well, the whole
>     thing devolves into a NOP.
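
(On a uniprocessor the whole mechanism can simply compile away, since
there is only ever one "owning cpu"; something along these lines, with
the SMP macro standing in for whatever config option the kernel really
uses.)

    struct token;

    #ifdef SMP
    void token_acquire(struct token *tok);      /* real per-cpu logic */
    void token_release(struct token *tok);
    #else
    /*
     * UP: a blocking call already implies a scheduler switch and there
     * is no other cpu to compete with, so the primitives become no-ops.
     */
    #define token_acquire(tok)      ((void)(tok))
    #define token_release(tok)      ((void)(tok))
    #endif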
> 
>     I have some work to do to move my current token implementation into
>     this new paradigm, but I do not expect it to be difficult.  For
>     example, if cpu #B steals a token from cpu #A then the threads that
>     own the token on cpu #A can't be scheduled to run again until cpu #B
>     is through with the token.  This means actively accounting for and
>     structurally linking the dependency information.
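
(One way to picture that accounting: when the token migrates, the local
threads that still logically hold it are linked onto it and skipped by
the local scheduler until it comes home.  The structures below are
purely illustrative, not the kernel's.)

    struct cpu;

    struct thread {
            struct thread *td_toknext;  /* next local holder of same token */
            int            td_blocked;  /* skipped by the local scheduler  */
    };

    struct token {
            struct cpu    *t_owner;
            struct thread *t_waiters;   /* local holders parked while the
                                           token lives on another cpu      */
    };

    /* Owning cpu hands the token away at a thread-switch boundary. */
    static void
    token_migrate(struct token *tok, struct cpu *to, struct thread *holder)
    {
            holder->td_blocked = 1;     /* cannot run until the token returns */
            holder->td_toknext = tok->t_waiters;
            tok->t_waiters = holder;
            tok->t_owner = to;
    }

    /* The remote cpu is done; the token comes home and the holders wake. */
    static void
    token_return(struct token *tok, struct cpu *home)
    {
            struct thread *td;

            tok->t_owner = home;
            while ((td = tok->t_waiters) != NULL) {
                    tok->t_waiters = td->td_toknext;
                    td->td_blocked = 0;
            }
    }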
> 
>     Whew!  That may seem complex, and in a mutexed environment it would
>     be very complex.  But remember in DFly the whole basis of using this
>     abstraction is to be able to operate almost as if one were on a
>     single cpu even when one is on a 16-cpu system, so the actual
>     structural accounting does not really have to worry about interrupts
>     or stale data or other cpus.  A critical section is sufficient to
>     operate on token meta data on the owning cpu, and we can actually
>     defer structural linkages within tokens we own until a thread switch
>     occurs (which won't happen in most cases since, like a mutex, a token
>     is obtained and released without blocking).
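
(The "critical section is enough" point might look like this in
practice; crit_enter/crit_exit stand for whatever the kernel's
critical-section primitives are, and the deferral flag is invented for
illustration.)

    struct token {
            int t_dirty;                /* metadata changed; link up at the
                                           next thread switch              */
    };

    void crit_enter(void);              /* stand-ins for the kernel's      */
    void crit_exit(void);               /* critical-section primitives     */

    static void
    token_touch_metadata(struct token *tok)
    {
            /*
             * Only the owning cpu reaches this code, so a critical
             * section is all the protection needed: no other cpu can see
             * a partial update and no bus-locked instruction is used.
             */
            crit_enter();
            tok->t_dirty = 1;           /* defer the structural linkage    */
            crit_exit();
    }

    /* The linkage itself then happens in the thread-switch path, which is
       the only place the token can change cpus anyway. */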
> 
>     I think I rambled on there.  Please feel free to ask additional
>     questions!
> 
> 						-Matt


-- 
bsdforums.org's moderator, mezz.
