git: kernel - fine-grained namecache and partial vnode MPSAFE work
Matthew Dillon
dillon at crater.dragonflybsd.org
Sun Dec 27 23:10:46 PST 2009
commit 2247fe02f4e80c2f2acaa71e60bf6b98eb848dca
Author: Matthew Dillon <dillon at apollo.backplane.com>
Date: Sun Dec 27 22:36:07 2009 -0800
kernel - fine-grained namecache and partial vnode MPSAFE work
Namecache subsystem
* All vnode->v_flag modifications now use vsetflags() and vclrflags().
Because some flags are set and cleared by vhold()/vdrop() which
do not require any locks to be held, all modifications must use atomic
ops.
* Clean up and revamp the namecache MPSAFE work. Namecache operations now
use a fine-grained MPSAFE locking model which loosely follows these
rules:
- lock ordering is child to parent. e.g. lock file, then lock parent
directory. This allows resolver recursions up the parent directory
chain.
- Downward-traversing namecache invalidations and path lookups will
unlock the parent (but leave it referenced) before attempting to
lock the child.
- Namecache hash table lookups utilize a per-bucket spinlock.
- vnode locks may be acquired while holding namecache locks but not
vice versa. Vnodes are not destroyed until all namecache references
go away, but can enter reclamation. Namecache lookups detect the case
and re-resolve to overcome the race. Namecache entries are not
destroyed while referenced.
* Remove vfs_token; the namecache MPSAFE model is now totally fine-grained.
* Revamp namecache locking primitives (cache_lock/cache_unlock and
friends). Use atomic ops and nc_exlocks instead of nc_locktd and
build in a request flag. This solves busy/tsleep races between lock
holder and lock requester.
* Revamp namecache parent/child linkages. Instead of using vfs_token to
lock such operations we simply lock both child and parent namecache
entries. Hash table operations are also fully integrated with the
parent/child linking operations.
* The vnode->v_namecache list is locked via vnode->v_spinlock, which
is actually vnode->v_lock.lk_spinlock.
* Revamp cache_vref() and cache_vget(). The passed namecache entry must
be referenced and locked. Internals are simplified.
* Fix a deadlock by moving the call to _cache_hysteresis() to a
place where the current thread otherwise does not hold any locked
ncp's.
* Revamp nlookup() to follow the new namecache locking rules.
* Fix a number of places, e.g. in vfs/nfs/nfs_subs.c, where ncp->nc_parent
or ncp->nc_vp was being accessed with an unlocked ncp. nc_parent
and nc_vp accesses are only valid if the ncp is locked.
* Add the vfs.cache_mpsafe sysctl, which defaults to 0. This may be set
to 1 to enable MPSAFE namecache operations for [l,f]stat() and open()
system calls (for the moment).
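The "atomic ops plus a built-in request flag" idea from the locking primitives item above can be sketched in userland C11 as follows. This is only an illustrative model, not the kernel code: struct ncp_model, EXLOCK_REQ_MODEL and the ncp_model_*() helpers are hypothetical names, the lock is non-recursive, and the actual sleep/wakeup calls are left out. The point it tries to show is that a waiter records its request in the same word that holds the lock count, so the releasing thread cannot miss it.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical "a waiter wants a wakeup" bit, kept in the lock word. */
#define EXLOCK_REQ_MODEL 0x80000000u

/* Hypothetical stand-in for a namecache entry; only the lock word is modeled. */
struct ncp_model {
    atomic_uint exlocks;    /* hold count (0 or 1 here) plus request bit */
};

/* Try to take the lock with a single compare-and-swap from 0 to 1. */
static bool ncp_model_trylock(struct ncp_model *ncp)
{
    unsigned expected = 0;
    return atomic_compare_exchange_strong(&ncp->exlocks, &expected, 1u);
}

/*
 * Called by a thread that is about to sleep on the lock. Returns true if
 * the request bit was set while the lock was still held (sleeping is
 * safe), false if the lock was released first (the caller should retry).
 */
static bool ncp_model_request(struct ncp_model *ncp)
{
    unsigned cur = atomic_load(&ncp->exlocks);

    while (cur != 0) {
        /* On failure, cur is reloaded with the current value. */
        if (atomic_compare_exchange_weak(&ncp->exlocks, &cur,
                                         cur | EXLOCK_REQ_MODEL))
            return true;
    }
    return false;
}

/* Release the lock; returns true if a waiter asked to be woken up. */
static bool ncp_model_unlock(struct ncp_model *ncp)
{
    unsigned old = atomic_exchange(&ncp->exlocks, 0u);
    return (old & EXLOCK_REQ_MODEL) != 0;
}
```

In this model the holder clears the count and the request bit in one atomic exchange, so a request is either seen by the unlock or refused because the lock is already free.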
VFS/VNODE subsystem
* Use a global spinlock for now called vfs_spin to manage vnode_free_list.
Use vnode->v_spinlock (and vfs_spin) to manage vhold/vdrop ops and
to interlock v_auxrefs tests against vnode terminations.
* Integrate per-mount mnt_token and (for now) the MP lock into VOP_*()
and VFS_*() operations. This allows the MP lock to be shifted further
inward from the system calls, but we don't do it quite yet.
* HAMMER: VOP_GETATTR, VOP_READ, and VOP_INACTIVE are now MPSAFE. The
corresponding sysctls have been removed.
* FIFOFS: Needed some MPSAFE work in order to allow HAMMER to make things
MPSAFE above, since HAMMER forwards vops for in-filesystem fifos to
fifofs.
* Add some debugging kprintf()s when certain MP races are averted, for
testing only.
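Because vhold()/vdrop() may change vnode flags without holding any lock, every v_flag update has to be an atomic read-modify-write. A minimal userland sketch of that rule, assuming C11 atomics in place of the kernel's atomic ops, is shown here. struct vnode_model, the VM_FLAG_* bits and the model_*() helpers are hypothetical and are not the real vsetflags()/vclrflags() implementation.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for struct vnode; only the flag word is modeled. */
struct vnode_model {
    atomic_uint v_flag;
};

/* Hypothetical flag bits, for illustration only. */
#define VM_FLAG_A 0x0001u
#define VM_FLAG_B 0x0002u

/* Set bits with an atomic OR so concurrent updates are never lost. */
static void model_vsetflags(struct vnode_model *vp, unsigned flags)
{
    atomic_fetch_or(&vp->v_flag, flags);
}

/* Clear bits with an atomic AND of the complement. */
static void model_vclrflags(struct vnode_model *vp, unsigned flags)
{
    atomic_fetch_and(&vp->v_flag, ~flags);
}
```

A plain "vp->v_flag |= flags" would be a separate load and store, so two CPUs updating different bits at once could overwrite each other's change.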
MISC
* Add some assertions to the VM system.
* Document existing and newly MPSAFE code.
Summary of changes:
.../linux/i386/linprocfs/linprocfs_subr.c | 2 +-
sys/emulation/linux/linux_misc.c | 2 +-
sys/kern/imgact_aout.c | 2 +-
sys/kern/imgact_elf.c | 2 +-
sys/kern/kern_checkpoint.c | 2 +-
sys/kern/kern_iosched.c | 9 +
sys/kern/kern_lockf.c | 4 +-
sys/kern/vfs_cache.c | 802 +++++++++++++-------
sys/kern/vfs_conf.c | 6 +-
sys/kern/vfs_lock.c | 152 +++-
sys/kern/vfs_mount.c | 2 +
sys/kern/vfs_nlookup.c | 84 ++-
sys/kern/vfs_subr.c | 28 +-
sys/kern/vfs_sync.c | 4 +-
sys/kern/vfs_syscalls.c | 66 +-
sys/kern/vfs_vfsops.c | 49 +-
sys/kern/vfs_vnops.c | 58 +-
sys/kern/vfs_vopops.c | 131 ++--
sys/platform/pc32/i386/pmap.c | 6 +-
sys/sys/mount.h | 42 +-
sys/sys/namecache.h | 63 ++-
sys/vfs/devfs/devfs_core.c | 3 +-
sys/vfs/devfs/devfs_vnops.c | 2 +-
sys/vfs/fdesc/fdesc_vfsops.c | 2 +-
sys/vfs/fifofs/fifo_vnops.c | 2 +-
sys/vfs/gnu/ext2fs/ext2_quota.c | 4 +-
sys/vfs/hammer/hammer.h | 1 +
sys/vfs/hammer/hammer_inode.c | 12 +-
sys/vfs/hammer/hammer_vfsops.c | 3 +-
sys/vfs/hpfs/hpfs_vfsops.c | 2 +-
sys/vfs/isofs/cd9660/cd9660_vfsops.c | 2 +-
sys/vfs/msdosfs/msdosfs_denode.c | 2 +-
sys/vfs/nfs/nfs_subs.c | 3 +
sys/vfs/nfs/nfs_vfsops.c | 4 +-
sys/vfs/ntfs/ntfs_vfsops.c | 4 +-
sys/vfs/nwfs/nwfs_vfsops.c | 2 +-
sys/vfs/portal/portal_vfsops.c | 2 +-
sys/vfs/smbfs/smbfs_vfsops.c | 2 +-
sys/vfs/udf/udf_vfsops.c | 2 +-
sys/vfs/ufs/ufs_quota.c | 4 +-
sys/vfs/ufs/ufs_vnops.c | 2 +-
sys/vfs/union/union_subr.c | 2 +-
sys/vm/vm_object.c | 2 +-
sys/vm/vnode_pager.c | 10 +-
44 files changed, 1007 insertions(+), 583 deletions(-)
http://gitweb.dragonflybsd.org/dragonfly.git/commitdiff/2247fe02f4e80c2f2acaa71e60bf6b98eb848dca
--
DragonFly BSD source repository