lkwt in DragonFly

Matthew Dillon dillon at apollo.backplane.com
Sat Feb 7 13:16:14 PST 2004


:> RCU = Remote Copy Update, right? Just asking, so I can find the right
:> paper to read more about it.
:
:Read-Copy-Update.  It's a read-mostly algorithm that's used to cluster data
:structure writes/changes so that it minimizes bus traffic across processors.
:
:http://lse.sourceforge.net/locking/rcupdate.html
:
:What Matt is talking about is merging a kind of quiescence operation into
:dfBSD tokens, if I understand correctly.
:
:bill

    Right.  My new idea for the token API gives us a scheduler-level
    serialization abstraction that works across the whole system.  It
    would kinda be like having multiple 'MP' locks in that it would have
    the same sort of side effects as an MP lock.

    The traditional serialization abstraction is the concept of a 'mutex',
    which is what FreeBSD-5 uses.  Linux tries to avoid mutex overheads
    by using RCU (anyone can read data without having to worry about it
    being changed out from under them, but modifications must be synchronized
    across all cpus at a scheduler boundary).  Right now we try to avoid
    mutex-like overheads through our per-cpu model (e.g. LWKT scheduler
    and SLAB allocator) and through our per-cpu tokens.

    The problem with our current token abstraction is very similar to the
    problem FreeBSD-5 has with its mutex abstraction.  In our case, a 
    token can be 'lost' when a thread blocks or gets interrupted outside
    of a critical section, so the code must carefully handle re-acquisition
    of potentially lost tokens.  In FreeBSD-5's case a mutex
    is not allowed to be held across a blocking operation at all and so
    the use of mutexes must be carefully coded... adding significant complexity
    to deeply nested APIs.  Mutexes also have lock ordering issues (deadlock
    potential).
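
    To illustrate the difference, here is a rough sketch.  The names
    old_gettoken(), old_token_was_lost(), old_reltoken() and
    do_something_that_might_block() are invented for the example;
    only mtx_lock()/mtx_unlock() are the real FreeBSD-5 mutex calls:

        /*
         * Current per-cpu token model: the token can be silently lost
         * across any blocking or interruption point outside a critical
         * section, so the caller has to detect the loss and re-acquire
         * (and re-validate) by hand.
         */
        void
        old_token_user(struct old_token *tok)
        {
                old_gettoken(tok);
                do_something_that_might_block();
                if (old_token_was_lost(tok)) {
                        old_gettoken(tok);
                        /* ... and re-validate any cached state ... */
                }
                old_reltoken(tok);
        }

        /*
         * FreeBSD-5 mutex model: the mutex may not be held across a
         * blocking operation at all, so it must be dropped and re-taken
         * around the call, and every caller in the chain has to know
         * that.
         */
        void
        mutex_user(struct mtx *m)
        {
                mtx_lock(m);
                mtx_unlock(m);          /* must drop before blocking */
                do_something_that_might_block();
                mtx_lock(m);            /* and re-take afterwards */
                mtx_unlock(m);
        }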

    My new token idea takes the original per-cpu idea and abstracts it
    into a serialization model.  We could, in fact, even support 'reader'
    and 'writer' refs to tokens in order to allow multiple readers to
    hold the same token at the same time.  The key difference between 
    the new token idea and, say, a standard lock, is that the token is
    only in effect while the thread that acquired it is running.  When
    the thread blocks or switches away the tokens that thread has acquired
    can be borrowed by other threads (on the same or on other cpus) and
    must be returned before the original thread can resume.  
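
    In code the idea might look something like this (the names are
    invented for illustration, not a real API):

        /*
         * Hypothetical new-style token usage.  'Read' refs can be shared
         * by multiple threads, a 'write' ref is exclusive.  The key point:
         * if this thread blocks or is switched away the scheduler may lend
         * the token to other threads, but it is returned before this
         * thread resumes.
         */
        void
        new_token_user(struct new_token *tok)
        {
                struct new_tokref ref;

                new_gettoken_rd(&ref, tok);     /* shared 'reader' ref */
                /* ... read the protected data ... */
                new_reltoken(&ref);

                new_gettoken_wr(&ref, tok);     /* exclusive 'writer' ref */
                /* ... modify the protected data ... */
                new_reltoken(&ref);
        }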

    So, unlike mutexes or locks, a token only guarantees run-time
    serialization.  A token also has no deadlock issues and no lock
    ordering issues at all.  Like mutexes, code that uses tokens must
    be aware of potential blocking conditions (because atomicity is lost
    across a blocking operation), but unlike mutexes the code that
    uses tokens does not have to release the tokens across the blocking
    operation.  I believe this to be a VERY IMPORTANT feature of the
    new token design because it means that (A) multiple tokens can be held
    across a blocking situation, (B) higher level procedures do not need
    to pass token references to lower level procedures that 'might block';
    they need only be aware that the lower level procedure might block,
    and (C) the new token abstraction leaves open the possibility of 
    major implementation optimizations because it devolves nearly into
    a NOP on UP systems.  e.g. we can support per-cpu 'caching' of the 
    token after it has been released, allowing us to avoid locked bus
    cycles as well as memory barriers in the cache case.
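
    For example, something like this would be legal under the new
    abstraction (same invented names as above):

        /*
         * Two tokens held across a blocking call.  Atomicity is lost
         * while we are blocked (other threads may borrow the tokens and
         * change the data), so anything cached before the blocking point
         * must be re-checked -- but we never explicitly release and
         * re-acquire the tokens, and helper_that_might_block() does not
         * have to be told about them at all.
         */
        void
        update_two_structures(struct new_token *ta, struct new_token *tb)
        {
                struct new_tokref ra, rb;

                new_gettoken_wr(&ra, ta);
                new_gettoken_wr(&rb, tb);

                helper_that_might_block();  /* tokens ride through the block */

                /* re-validate anything cached before the blocking point */

                new_reltoken(&rb);
                new_reltoken(&ra);
        }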

    So how does this tie into the RCU concept?  Simple, really... RCU
    turns out to be a special case of the new token abstraction, that's
    all.  In the new abstraction if you had a token T and every thread
    in the system obtained 'read access' on T, then any thread which
    then obtains 'write access' on T would be serialized against all
    the threads with read access at the scheduler level.  If we wanted to
    implement Linux's 'queued' RCU API then we would queue structural
    changes to a thread that obtains 'write access' on T, and this
    translates directly to updating the data during
    quiescent-scheduler periods.
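
    Sketched with the same invented names:

        /*
         * Hypothetical RCU-on-tokens.  Every reader holds a shared ref on
         * a single token T; a writer obtaining the exclusive ref is
         * serialized, at the scheduler level, against all of those
         * readers -- which is essentially the quiescent period RCU
         * waits for.
         */
        struct new_token T;

        void
        reader(void)
        {
                struct new_tokref ref;

                new_gettoken_rd(&ref, &T);
                /* ... traverse the shared data structure ... */
                new_reltoken(&ref);
        }

        void
        writer_apply_queued_updates(void)
        {
                struct new_tokref ref;

                new_gettoken_wr(&ref, &T);
                /* ... apply queued structural changes here; this runs
                   only during a scheduler-quiescent period relative to
                   the readers ... */
                new_reltoken(&ref);
        }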

    I actually don't like Linux's queued RCU API; I am just saying that
    it could be implemented natively on top of the new token abstraction.
    We would use our tokens in a more active manner.  e.g. instead of
    holding a read token and queueing write updates to another entity that
    holds the write token we would obtain a write token and do the 
    update in real time, then release the write token.  Nothing prevents
    us from doing it both ways, though :-)
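
    The more active style, with the same invented names, would simply be:

        /*
         * Obtain the write ref, make the change in real time, release it.
         * (struct foo and its 'value' field are placeholders.)
         */
        void
        update_inline(struct new_token *tok, struct foo *obj, int value)
        {
                struct new_tokref ref;

                new_gettoken_wr(&ref, tok);
                obj->value = value;             /* direct, real-time update */
                new_reltoken(&ref);
        }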

    So what does this mean?  This means that the new token abstraction
    will not be a replacement for standard reader/writer (lockmgr) locks,
    but it certainly would be an excellent replacement for mutexes
    because it has far fewer side effects, requires less knowledge of
    the lock state to be passed down the subroutine chain, and is far more
    flexible.

					-Matt
					Matthew Dillon 
					<dillon at xxxxxxxxxxxxx>




