git: kernel - Major MPSAFE Infrastructure

Matthew Dillon dillon at crater.dragonflybsd.org
Fri Aug 27 17:24:03 PDT 2010


commit 77912481ac5f5d886b07c9f7038b03eba09b2bca
Author: Matthew Dillon <dillon at apollo.backplane.com>
Date:   Thu Aug 26 21:18:06 2010 -0700

    kernel - Major MPSAFE Infrastructure
    
    * vm_page_lookup() now requires the vm_token to be held on call instead of
      the MP lock.  Fix the few places where the routine was being called
      without the vm_token.
    
      Various situations where a vm_page_lookup() is performed followed by
      vm_page_wire(), without busying the page, and other similar situations,
      require the vm_token to be held across the whole block of code.
    
    * bio_done callbacks are now MPSAFE but some drivers (ata, ccd, vinum,
      aio, nfs) are not MPSAFE yet so get the mplock for those.   They will
      be converted to a generic driver-wide token later.
    
    * Remove critical sections that used to protect VM system related
      interrupts, replace with the vm_token.
    
    * Spinlocks now bump thread->td_critcount in addition to
      mycpu->gd_spinlock*.  Note the ordering is important.  Then remove
      gd_spinlock* checks elsewhere that are covered by td_critcount and
      replace with assertions.
    
      Also use td_critcount in the kern_mutex.c code instead of gd_spinlock*.
    
      This fixes situations where the last crit_exit() would call splx()
      without checking for spinlocks.  Adding the additional checks would
      have made the crit_*() inlines too complex so instead we just fold
      it into td_critcount.
    
    * lwkt_yield() no longer guarantees that lwkt_switch() will be called
      so call lwkt_switch() instead in places where a switch is required.
      For example, to unwind a preemption.  Otherwise the kernel could end
      up live-locking trying to yield because the new switch code does not
      necessarily schedule a different kernel thread.
    
    * Add the sysctl user_pri_sched (default 0).  Setting this will make
      the LWKT scheduler more aggressively schedule user threads when
      runnable kernel threads are unable to gain token/mplock resources.
      For debugging only.
    
    * Change the bufspin spinlock to bufqspin and bufcspin, and generally
      rework vfs_bio.c to lock numerous fields with bufcspin.  Also use
      bufcspin to interlock waitrunningbufspace() and friends.
    
      Remove several mplocks in vfs_bio.c that are no longer needed.
    
      Protect the page manipulation code in vfs_bio.c with vm_token instead
      of the mplock.
    
    * Fix a deadlock with the FINDBLK_TEST/BUF_LOCK sequence which can occur
      due to the fact that the buffer may change its (vp,loffset) during
      the BUF_LOCK call.  Even though the code checks for this after
      the lock succeeds there is still the problem of the locking operation
      itself potentially creating a deadlock between two threads by locking
      an unexpected buffer when the caller is already holding other buffers
      locked.
    
      We do this by adding an interlock refcounter, b_refs.  getnewbuf()
      will avoid reusing such buffers.
    
    * The syncer_token was not protecting all accesses to the syncer list.
      Fix that.
    
    * Make HAMMER MPSAFE.  All major entry points now use a per-mount token,
      hmp->fs_token.  Backend callbacks (bioops, bio_done) use hmp->io_token.
      The cache-case for the read and getattr paths requires no tokens at
      all (as before).
    
      The bitfield flags had to be separated into two groups to deal with
      SMP cache coherency races.
    
      Certain flags in the hammer_record structure had to be separated for
      the same reason.
    
      Certain interactions between the frontend and the backend must use
      the hmp->io_token.
    
      It is important to note that for any given buffer there are two
      locking entities: (1) The hammer structure and (2) The buffer cache
      buffer.  These interactions are very fragile.
    
      Do not allow the kernel to flush a dirty buffer if we are unable
      to obtain a norefs-interlock on the buffer, which fixes numerous
      frontend/backend MP races on the io structure.
    
      Add a write interlock in one of the recover_flush_buffer cases.
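
The b_refs interlock described in the FINDBLK_TEST/BUF_LOCK item can be pictured with a small user-space model. This is only a sketch under stated assumptions, not the vfs_bio.c code: apart from b_refs, every name here (toy_buf, toy_buf_ref, toy_buf_unref, toy_getnewbuf) is hypothetical, and the real kernel also has to interlock the check against concurrent lookups, which this model does not attempt.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Toy stand-in for a buffer cache buffer. Only b_refs is named in the commit. */
struct toy_buf {
    atomic_int b_refs;   /* interlock count: nonzero means "do not reuse" */
    int        vp_id;    /* hypothetical stand-in for the vnode identity */
    long       loffset;  /* logical offset within that vnode */
};

/* A lookup pins the buffer before it blocks on the buffer lock. */
static void toy_buf_ref(struct toy_buf *bp)
{
    atomic_fetch_add(&bp->b_refs, 1);
}

/* Drop the pin once the lock attempt has finished. */
static void toy_buf_unref(struct toy_buf *bp)
{
    int old = atomic_fetch_sub(&bp->b_refs, 1);
    assert(old > 0);     /* an unref without a matching ref is a bug */
}

/*
 * Recycler: return the first buffer that nobody has pinned, or NULL.
 * Skipping pinned buffers means a pinned buffer cannot be handed out
 * for reuse (and so cannot change identity that way) while a lookup
 * is still waiting on its lock.
 */
static struct toy_buf *toy_getnewbuf(struct toy_buf *pool, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (atomic_load(&pool[i].b_refs) == 0)
            return &pool[i];
    }
    return NULL;
}
```

In this model a caller would pin with toy_buf_ref(), attempt the lock, recheck the identity, and then call toy_buf_unref(). A buffer is eligible for reuse again only after its last pin is dropped.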

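The spinlock item above folds the "is a spinlock held" question into td_critcount so that crit_exit() needs only one test. The following single-threaded model is a sketch of that idea, not the kernel code. Assumptions: the per-cpu counter is called toy_spin_held here because the commit only names it as gd_spinlock*; toy_splx() stands in for the pending-interrupt processing; the bump order shown (td_critcount first) and the choice to run pending work from the unlock path are guesses, since the commit says only that the ordering is important.

```c
#include <assert.h>

struct toy_thread { int td_critcount; };   /* critical-section nesting count */
struct toy_cpu {
    int toy_spin_held;   /* hypothetical stand-in for mycpu->gd_spinlock* */
    int pending;         /* deferred interrupt work waiting to run */
    int handled;         /* deferred work that has been run */
};

/* Stand-in for splx(): run whatever was deferred. */
static void toy_splx(struct toy_cpu *gd)
{
    gd->handled += gd->pending;
    gd->pending = 0;
}

static void toy_crit_enter(struct toy_thread *td)
{
    td->td_critcount++;
}

/* Only td_critcount is examined; no separate spinlock check is needed. */
static void toy_crit_exit(struct toy_thread *td, struct toy_cpu *gd)
{
    assert(td->td_critcount > 0);
    if (--td->td_critcount == 0 && gd->pending)
        toy_splx(gd);
}

static void toy_spin_lock(struct toy_thread *td, struct toy_cpu *gd)
{
    td->td_critcount++;      /* assumed order: enter the critical section first */
    gd->toy_spin_held++;
}

static void toy_spin_unlock(struct toy_thread *td, struct toy_cpu *gd)
{
    assert(gd->toy_spin_held > 0);
    gd->toy_spin_held--;
    toy_crit_exit(td, gd);   /* assumption: unlock drops the count it took */
}
```

With this arrangement, a crit_exit() issued while a spinlock is still held leaves td_critcount nonzero, so the deferred work is not run until the spinlock is released.
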
Summary of changes:
 sys/dev/agp/agp.c                   |    4 +
 sys/dev/agp/agp_i810.c              |    5 +-
 sys/dev/disk/ata/ata-raid.c         |    4 +
 sys/dev/disk/ccd/ccd.c              |    8 +-
 sys/dev/raid/vinum/vinumhdr.h       |    1 +
 sys/dev/raid/vinum/vinuminterrupt.c |    6 +
 sys/kern/kern_exec.c                |    9 +-
 sys/kern/kern_mutex.c               |   41 ++-
 sys/kern/kern_slaballoc.c           |   12 +-
 sys/kern/kern_spinlock.c            |    2 +
 sys/kern/lwkt_thread.c              |   23 +-
 sys/kern/uipc_syscalls.c            |    4 +
 sys/kern/usched_bsd4.c              |   28 +-
 sys/kern/vfs_aio.c                  |    2 +
 sys/kern/vfs_bio.c                  |  498 +++++++++++++++++++++--------------
 sys/kern/vfs_cluster.c              |    6 +-
 sys/kern/vfs_subr.c                 |   16 +-
 sys/kern/vfs_sync.c                 |   62 +++--
 sys/platform/pc32/isa/clock.c       |   10 +-
 sys/platform/pc64/isa/clock.c       |   10 +-
 sys/sys/bio.h                       |    2 +-
 sys/sys/buf.h                       |    6 +-
 sys/sys/spinlock2.h                 |   47 +---
 sys/sys/vnode.h                     |    3 +-
 sys/vfs/devfs/devfs_vnops.c         |   11 +-
 sys/vfs/hammer/hammer.h             |   33 ++-
 sys/vfs/hammer/hammer_flusher.c     |    3 +
 sys/vfs/hammer/hammer_io.c          |  177 ++++++++++---
 sys/vfs/hammer/hammer_object.c      |    6 +-
 sys/vfs/hammer/hammer_ondisk.c      |    9 +-
 sys/vfs/hammer/hammer_recover.c     |    9 +
 sys/vfs/hammer/hammer_volume.c      |    6 +
 sys/vfs/nfs/nfs_bio.c               |   10 +
 sys/vm/swap_pager.c                 |    4 +-
 sys/vm/vm_page.c                    |    8 +-
 35 files changed, 694 insertions(+), 391 deletions(-)

http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/77912481ac5f5d886b07c9f7038b03eba09b2bca


-- 
DragonFly BSD source repository




