new spinlock rules, MP work update

Matthew Dillon dillon at apollo.backplane.com
Sun May 21 18:44:22 PDT 2006


    Since the VFS work is stuck until I can redo the vnode locking, which
    in turn is heavily dependent on the MP work, I've decided to make a
    big MP push.  I hate stalling out on the vnode code, but there are so
    many interdependencies in the codebase that figuring out the correct
    order for all the work I want to do is not easy.  I wind up having
    to nibble on
    one subsystem until I get stuck, then nibble on another, then a third,
    then the first gets unstuck due to the other work and I can go back to
    it, etc.  It can get frustrating at times!

    So now that I've gotten stuck on the VFS subsystem, I have to move on
    to the subsystem responsible for getting me stuck, which is the MP work.

    After some significant experimentation I was able to come up with a
    spinlock design that is optimal for MP operation.  This design will
    allow us to use several spinlocks in critical code paths with virtually
    no impact on performance.  The design extends Jeff's original spinlock
    design.   I committed it this morning.

    A core aspect of the new spinlock design is its extremely low overhead
    lock and unlock for 'shared' spinlocks (aka for read-only access
    to structures).  The total overhead is less than 10ns in nearly all
    cases and most especially in the multiple-access-shared-spinlock case.
    Virtually all of the overhead is due to a required memory fence op
    in spin_lock_rd().  However, this operation is local to the cpu and
    does not create any cpu cache conflicts between cpus.
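
    For illustration, a shared-spinlock read path looks something like
    this.  It's a minimal sketch rather than actual committed code:
    spin_unlock_rd() and spin_init() are assumed here as the obvious
    counterparts to the spin_lock_rd() described above, and the example
    structure is made up:

	#include <sys/spinlock.h>
	#include <sys/spinlock2.h>

	struct mystruct {
		struct spinlock ms_spin;	/* guards ms_value */
		int		ms_value;
	};

	/* spin_init(&ms->ms_spin) is assumed to run once at setup time */

	static int
	mystruct_get_value(struct mystruct *ms)
	{
		int value;

		spin_lock_rd(&ms->ms_spin);	/* cheap: local fence only */
		value = ms->ms_value;		/* read-only field access */
		spin_unlock_rd(&ms->ms_spin);	/* no cross-cpu cache traffic */
		return (value);
	}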

    A thread may hold any number of exclusive spinlocks at a time but may
    only hold *ONE* shared spinlock at a time.  This restriction is related
    to the algorithm that results in the extremely low overhead.

    In any case, the overhead is *SO* low that I have no qualms whatsoever
    about using spinlocks in critical code paths.  Even so, not all DragonFly
    structures will need to be locked.  Some of the most critical structures
    will not need spinlocks due to the DragonFly design.

    Here's a partial list of structures.

    Structure				Spinlock requirements

    scheduler(s)			none
    thread (core thread)		none
    lwp (light weight process)		none
    proc (governing process)		NEEDED (possibly only if threaded)

    ucred				Minimal
    filedesc   (process file table)	NEEDED
    file       (file pointer)		NEEDED
    vnode				NEEDED
    namecache				NEEDED

    sockbuf				NEEDED - for read/write interlock only
    socket				none
    route/arp table			none
    network protocols			none

    sfbuf/msfbuf			NEEDED
    vm_object				NEEDED
    vm_page				NEEDED
    pmap				NEEDED


    I have just committed some initial struct file and struct filedesc
    spinlock work and done some initial performance testing.  My results
    are encouraging!  Basic file descriptor lookups, fhold(), and fdrop()
    calls are now spinlocked and the additional overhead is not detectable
    in my buildworld tests.
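
    To give a flavor of the pattern, here is a simplified sketch of what
    reference counting under an exclusive spinlock looks like.  This is
    NOT the committed code: the structure and field names are made up,
    and spin_unlock_wr() is assumed as the counterpart to spin_lock_wr():

	struct file_sketch {
		struct spinlock f_spin;		/* guards f_count */
		int		f_count;	/* reference count */
	};

	static void
	sketch_fhold(struct file_sketch *fp)
	{
		spin_lock_wr(&fp->f_spin);	/* exclusive: we modify f_count */
		++fp->f_count;
		spin_unlock_wr(&fp->f_spin);
	}

	static int
	sketch_fdrop(struct file_sketch *fp)
	{
		int last;

		spin_lock_wr(&fp->f_spin);
		last = (--fp->f_count == 0);
		spin_unlock_wr(&fp->f_spin);	/* release before any freeing */
		return (last);			/* caller frees on last ref */
	}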

    I am going to add spinlocks to some of the other structures that need
    them today.

    Then it comes down to locking up the subsystems by tracking down all
    uses of the various structures and adding spinlocks where appropriate.
    It doesn't look too difficult.  The VM system will be the hardest,
    but generally speaking the structures that were a real mess to lock up
    in FreeBSD are almost precisely the structures that don't need to be
    spinlocked at all in DragonFly (e.g. thread, scheduler, socket protos,
    etc).

				SPINLOCK RULES

    * Use LWKT tokens if you want a spinlock that survives a blocking
      condition.  The LWKT token code will release the spinlock when the
      thread switches out and reacquire it when the thread switches back
      in.  LWKT tokens use exclusive spinlocks.

    * A thread may only hold one shared spinlock at a time.  
      AKA spin_lock_rd().

    * A thread may hold any number of exclusive spinlocks at a time.
      AKA spin_lock_wr().  But watch out for deadlock situations.

    * Spinlocks may not be held across blocking conditions and should not
      generally be held across complex procedure calls.  They are meant for
      structural field access.  If you need a more comprehensive lock, use
      a lockmgr lock rather than a spinlock.

      (exception: LWKT tokens can of course survive a blocking condition).

    * Any held spinlocks will prevent interrupt thread preemption from
      occurring with normal interrupts.  FAST interrupts and IPI functions
      are not affected.  Holding a spinlock is different from holding a
      critical section.  A critical section will prevent all interrupts 
      from occurring, including clock interrupts.

    * FAST interrupts and/or IPI functions should generally not try to obtain
      a spinlock as this can result in a deadlock.  In DragonFly, these
      functions almost universally operate ONLY on the local cpu and are
      interlocked ONLY with critical sections, not spinlocks.  Spinlocks
      that can be obtained by FAST ints or IPI functions should always only
      be obtained with a critical section held, as sketched below.
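
    Putting that last rule together with critical sections: code which
    shares a spinlock with a FAST interrupt or IPI function must enter
    a critical section before taking the lock.  A minimal sketch, again
    assuming spin_unlock_wr() as the counterpart to spin_lock_wr(), with
    made-up variable names:

	#include <sys/spinlock.h>
	#include <sys/spinlock2.h>
	#include <sys/thread2.h>		/* crit_enter()/crit_exit() */

	static struct spinlock ipi_spin;	/* spin_init()'d at boot */
	static int ipi_counter;			/* also bumped by an IPI func */

	static void
	bump_ipi_counter(void)
	{
		crit_enter();			/* keep IPI funcs off this cpu */
		spin_lock_wr(&ipi_spin);	/* safe: the IPI cannot preempt
						 * us while we hold the lock */
		++ipi_counter;
		spin_unlock_wr(&ipi_spin);
		crit_exit();
	}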

						-Matt





