git: kernel - fine-grained namecache and partial vnode MPSAFE work

Matthew Dillon dillon at crater.dragonflybsd.org
Sun Dec 27 23:10:46 PST 2009


commit 2247fe02f4e80c2f2acaa71e60bf6b98eb848dca
Author: Matthew Dillon <dillon at apollo.backplane.com>
Date:   Sun Dec 27 22:36:07 2009 -0800

    kernel - fine-grained namecache and partial vnode MPSAFE work
    
    			Namecache subsystem
    
    * All vnode->v_flag modifications now use vsetflags() and vclrflags().
      Because some flags are set and cleared by vhold()/vdrop() which
      do not require any locks to be held, all modifications must use atomic
      ops.
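
    A minimal userland sketch of the idea, assuming C11 atomics in place of
    the kernel's own atomic primitives.  struct demo_vnode, the DEMO_FLAG_*
    bits, and the demo_* function names are hypothetical stand-ins and are
    not the kernel's definitions:

```c
#include <stdatomic.h>

/* Hypothetical stand-in for a vnode; only the flag word is modeled. */
struct demo_vnode {
    atomic_uint v_flag;
};

/* Hypothetical flag bits, not the kernel's real values. */
#define DEMO_FLAG_A 0x0001u
#define DEMO_FLAG_B 0x0002u

/*
 * Atomically set the given bits.  A plain "v_flag |= flags" is a
 * read-modify-write and could lose a concurrent update made by a
 * thread that holds no lock.
 */
static void demo_vsetflags(struct demo_vnode *vp, unsigned int flags)
{
    atomic_fetch_or(&vp->v_flag, flags);
}

/* Atomically clear the given bits, leaving all other bits untouched. */
static void demo_vclrflags(struct demo_vnode *vp, unsigned int flags)
{
    atomic_fetch_and(&vp->v_flag, ~flags);
}
```

    Because both helpers are single atomic read-modify-write operations, a
    lockless caller and a locked caller can update different bits of the
    same word without either update being lost.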
    
    * Clean up and revamp the namecache MPSAFE work.  Namecache operations now
      use a fine-grained MPSAFE locking model which loosely follows these
      rules:
    
      - lock ordering is child to parent.  e.g. lock file, then lock parent
        directory.  This allows resolver recursions up the parent directory
        chain.
    
      - Downward-traversing namecache invalidations and path lookups will
        unlock the parent (but leave it referenced) before attempting to
        lock the child.
    
      - Namecache hash table lookups utilize a per-bucket spinlock.
    
      - vnode locks may be acquired while holding namecache locks but not
        vice versa.  Vnodes are not destroyed until all namecache references
        go away, but they can enter reclamation.  Namecache lookups detect
        this case and re-resolve to overcome the race.  Namecache entries
        are not destroyed while referenced.
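
    The first two rules can be illustrated with a small userland sketch.
    It assumes pthread mutexes in place of namecache locks; struct demo_ncp
    and the demo_* functions are hypothetical, and the re-validation step in
    demo_step_down() is an assumption about how a caller might detect a
    race, not a description of the committed code:

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a namecache entry. */
struct demo_ncp {
    pthread_mutex_t lock;
    int refs;                  /* reference count (atomicity elided) */
    struct demo_ncp *parent;   /* NULL at the root */
};

static void demo_ncp_init(struct demo_ncp *ncp, struct demo_ncp *parent)
{
    pthread_mutex_init(&ncp->lock, NULL);
    ncp->refs = 1;
    ncp->parent = parent;
}

/*
 * Upward direction: lock the child first, then its parent.
 * Returns the locked parent, or NULL if the child is the root.
 */
static struct demo_ncp *demo_lock_child_then_parent(struct demo_ncp *child)
{
    pthread_mutex_lock(&child->lock);
    if (child->parent != NULL)
        pthread_mutex_lock(&child->parent->lock);
    return child->parent;
}

/*
 * Downward direction: the caller holds the parent locked and referenced.
 * The parent lock is dropped (its reference is kept) before the child
 * lock is taken, so the child-to-parent order is never inverted.
 * Returns 1 if the child is still linked under the parent, else 0.
 */
static int demo_step_down(struct demo_ncp *parent, struct demo_ncp *child)
{
    pthread_mutex_unlock(&parent->lock);
    pthread_mutex_lock(&child->lock);
    return child->parent == parent;   /* assumed re-validation step */
}
```

    Since no thread ever holds a parent lock while waiting for a child
    lock, two threads walking in opposite directions cannot deadlock on
    the same parent/child pair.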
    
    * Remove vfs_token.  The namecache MPSAFE model is now totally fine-grained.
    
    * Revamp namecache locking primitives (cache_lock/cache_unlock and
      friends).  Use atomic ops and nc_exlocks instead of nc_locktd and
      build in a request flag.  This solves busy/tsleep races between lock
      holder and lock requester.
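
    One way a count-plus-request-flag lock word can close such a race is
    sketched below, assuming C11 atomics.  struct demo_lock, DEMO_EXLOCK_REQ,
    and the demo_* functions are hypothetical; the sleep and wakeup calls a
    kernel would make are reduced to return values:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_EXLOCK_REQ 0x80000000u   /* hypothetical "a waiter exists" bit */

struct demo_lock {
    atomic_uint exlocks;   /* hold count in the low bits, request flag on top */
};

/*
 * Try to take the lock.  On failure the request flag is set with a
 * compare-and-swap against the exact value observed, so the holder
 * cannot release the lock between the check and the request.
 * Returns true if acquired; false means the caller would now sleep.
 */
static bool demo_try_lock(struct demo_lock *lk)
{
    for (;;) {
        unsigned int count = atomic_load(&lk->exlocks);
        if (count == 0) {
            if (atomic_compare_exchange_weak(&lk->exlocks, &count, 1u))
                return true;
            continue;   /* lost a race, re-read the word */
        }
        if (atomic_compare_exchange_weak(&lk->exlocks, &count,
                                         count | DEMO_EXLOCK_REQ))
            return false;
    }
}

/*
 * Release one hold.  Returns true when the last hold was dropped and a
 * waiter had registered a request, i.e. a wakeup is needed.
 */
static bool demo_unlock(struct demo_lock *lk)
{
    for (;;) {
        unsigned int count = atomic_load(&lk->exlocks);
        unsigned int holds = count & ~DEMO_EXLOCK_REQ;
        assert(holds > 0);   /* unlocking an unheld lock is a caller bug */
        unsigned int next = (holds == 1) ? 0u : count - 1;
        if (atomic_compare_exchange_weak(&lk->exlocks, &count, next))
            return holds == 1 && (count & DEMO_EXLOCK_REQ) != 0;
    }
}
```

    Keeping the request flag in the same word as the hold count means the
    final release and the "is anyone waiting" test are one atomic step, so
    a request can never be registered against a lock that has already been
    freed without the releaser seeing it.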
    
    * Revamp namecache parent/child linkages.  Instead of using vfs_token to
      lock such operations we simply lock both child and parent namecache
      entries.  Hash table operations are also fully integrated with the
      parent/child linking operations.
    
    * The vnode->v_namecache list is locked via vnode->v_spinlock, which
      is actually vnode->v_lock.lk_spinlock.
    
    * Revamp cache_vref() and cache_vget().  The passed namecache entry must
      be referenced and locked.  Internals are simplified.
    
    * Fix a deadlock by moving the call to _cache_hysteresis() to a
      place where the current thread otherwise does not hold any locked
      ncp's.
    
    * Revamp nlookup() to follow the new namecache locking rules.
    
    * Fix a number of places, e.g. in vfs/nfs/nfs_subs.c, where ncp->nc_parent
      or ncp->nc_vp was being accessed with an unlocked ncp.  nc_parent
      and nc_vp accesses are only valid if the ncp is locked.
    
    * Add the vfs.cache_mpsafe sysctl, which defaults to 0.  This may be set
      to 1 to enable MPSAFE namecache operations for [l,f]stat() and open()
      system calls (for the moment).
    
    			VFS/VNODE subsystem
    
    * For now, use a global spinlock called vfs_spin to manage
      vnode_free_list.  Use vnode->v_spinlock (and vfs_spin) to manage
      vhold/vdrop ops and to interlock v_auxrefs tests against vnode
      terminations.
    
    * Integrate per-mount mnt_token and (for now) the MP lock into VOP_*()
      and VFS_*() operations.  This allows the MP lock to be shifted further
      inward from the system calls, but we don't do it quite yet.
    
    * HAMMER: VOP_GETATTR, VOP_READ, and VOP_INACTIVE are now MPSAFE.  The
      corresponding sysctls have been removed.
    
    * FIFOFS: Needed some MPSAFE work in order to allow HAMMER to make things
      MPSAFE above, since HAMMER forwards vops for in-filesystem fifos to
      fifofs.
    
    * Add some debugging kprintf()s when certain MP races are averted, for
      testing only.
    
    				MISC
    
    * Add some assertions to the VM system.
    
    * Document existing and newly MPSAFE code.

Summary of changes:
 .../linux/i386/linprocfs/linprocfs_subr.c          |    2 +-
 sys/emulation/linux/linux_misc.c                   |    2 +-
 sys/kern/imgact_aout.c                             |    2 +-
 sys/kern/imgact_elf.c                              |    2 +-
 sys/kern/kern_checkpoint.c                         |    2 +-
 sys/kern/kern_iosched.c                            |    9 +
 sys/kern/kern_lockf.c                              |    4 +-
 sys/kern/vfs_cache.c                               |  802 +++++++++++++-------
 sys/kern/vfs_conf.c                                |    6 +-
 sys/kern/vfs_lock.c                                |  152 +++-
 sys/kern/vfs_mount.c                               |    2 +
 sys/kern/vfs_nlookup.c                             |   84 ++-
 sys/kern/vfs_subr.c                                |   28 +-
 sys/kern/vfs_sync.c                                |    4 +-
 sys/kern/vfs_syscalls.c                            |   66 +-
 sys/kern/vfs_vfsops.c                              |   49 +-
 sys/kern/vfs_vnops.c                               |   58 +-
 sys/kern/vfs_vopops.c                              |  131 ++--
 sys/platform/pc32/i386/pmap.c                      |    6 +-
 sys/sys/mount.h                                    |   42 +-
 sys/sys/namecache.h                                |   63 ++-
 sys/vfs/devfs/devfs_core.c                         |    3 +-
 sys/vfs/devfs/devfs_vnops.c                        |    2 +-
 sys/vfs/fdesc/fdesc_vfsops.c                       |    2 +-
 sys/vfs/fifofs/fifo_vnops.c                        |    2 +-
 sys/vfs/gnu/ext2fs/ext2_quota.c                    |    4 +-
 sys/vfs/hammer/hammer.h                            |    1 +
 sys/vfs/hammer/hammer_inode.c                      |   12 +-
 sys/vfs/hammer/hammer_vfsops.c                     |    3 +-
 sys/vfs/hpfs/hpfs_vfsops.c                         |    2 +-
 sys/vfs/isofs/cd9660/cd9660_vfsops.c               |    2 +-
 sys/vfs/msdosfs/msdosfs_denode.c                   |    2 +-
 sys/vfs/nfs/nfs_subs.c                             |    3 +
 sys/vfs/nfs/nfs_vfsops.c                           |    4 +-
 sys/vfs/ntfs/ntfs_vfsops.c                         |    4 +-
 sys/vfs/nwfs/nwfs_vfsops.c                         |    2 +-
 sys/vfs/portal/portal_vfsops.c                     |    2 +-
 sys/vfs/smbfs/smbfs_vfsops.c                       |    2 +-
 sys/vfs/udf/udf_vfsops.c                           |    2 +-
 sys/vfs/ufs/ufs_quota.c                            |    4 +-
 sys/vfs/ufs/ufs_vnops.c                            |    2 +-
 sys/vfs/union/union_subr.c                         |    2 +-
 sys/vm/vm_object.c                                 |    2 +-
 sys/vm/vnode_pager.c                               |   10 +-
 44 files changed, 1007 insertions(+), 583 deletions(-)

http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/2247fe02f4e80c2f2acaa71e60bf6b98eb848dca


-- 
DragonFly BSD source repository




