Problems with vnode locking/unlocking
Alex Hornung
ahornung at gmail.com
Tue May 26 23:57:00 PDT 2009
For the devfs_root->devfs_allocv issue, I've decided to check inside
allocv whether devfs_node->v_node is NULL and only THEN allocate a new
vnode, as it should be nulled out every time a vnode is reclaimed.
Also I've taken into account your suggestions regarding locking and
yesterday's problems are gone (hang when reading the content of a
subdirectory) but I've found a new one.
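
A minimal sketch of that NULL check, written as a user-space toy rather than kernel code. Every name here (toy_vnode, toy_devfs_node, toy_allocv, toy_vput, toy_reclaim) is a hypothetical stand-in and not the actual devfs source. It assumes the allocation helper hands the vnode back referenced and locked:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy user-space stand-ins; NOT the real DragonFly kernel structures. */
struct toy_vnode {
    int refs;    /* reference count */
    int locked;  /* 1 while the vnode is held locked */
};

struct toy_devfs_node {
    struct toy_vnode *v_node;  /* NULL until attached, NULLed again on reclaim */
};

static int toy_vnodes_allocated;  /* counts real allocations, for the demo only */

/*
 * Hypothetical analogue of devfs_allocv(): allocate a vnode only when the
 * node has none, otherwise reuse the existing one. Either way the vnode is
 * returned with one more reference and marked locked.
 */
static struct toy_vnode *
toy_allocv(struct toy_devfs_node *node)
{
    struct toy_vnode *vp = node->v_node;

    if (vp == NULL) {
        vp = calloc(1, sizeof(*vp));
        if (vp == NULL)
            return NULL;
        toy_vnodes_allocated++;
        node->v_node = vp;      /* remember it so later calls reuse it */
    }
    vp->refs++;
    vp->locked = 1;
    return vp;
}

/* Toy analogue of vput(): unlock and drop one reference. */
static void
toy_vput(struct toy_vnode *vp)
{
    vp->locked = 0;
    vp->refs--;
}

/* Toy analogue of reclaim: free the vnode and clear the back pointer. */
static void
toy_reclaim(struct toy_devfs_node *node)
{
    free(node->v_node);
    node->v_node = NULL;
}
```

Calling toy_allocv() twice on the same node returns the same pointer and bumps the allocation counter only once. After toy_reclaim() the next call allocates again.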
Now I get a panic in lockmgr due to the vn_unlock in nresolve when
creating, or trying to create, a symlink. I suppose I'm trying to unlock
an unlocked vnode and that is causing the problem. Is it right after all
to unlock the vnode in nresolve? I don't think nsymlink has even been
called yet at the time of the panic.
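
One possible reading, offered only as an assumption: the trace below shows vn_unlock with a first argument of 0. If the debugger's argument display is accurate, that would be consistent with a NULL vnode pointer (for example, a lookup of a name that does not exist yet) rather than a vnode that is merely unlocked. A toy user-space sketch of guarding the unlock follows. All names (toy_lookup, toy_nresolve, TOY_ENOENT) are hypothetical and do not come from the devfs code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_ENOENT 2  /* illustrative error value for "no such entry" */

struct toy_vnode { int refs; int locked; };
struct toy_entry { const char *name; struct toy_vnode vn; };

/* Toy lookup: returns the vnode referenced and locked, or NULL if absent. */
static struct toy_vnode *
toy_lookup(struct toy_entry *tab, size_t n, const char *name)
{
    for (size_t i = 0; i < n; i++) {
        if (strcmp(tab[i].name, name) == 0) {
            tab[i].vn.refs++;
            tab[i].vn.locked = 1;
            return &tab[i].vn;
        }
    }
    return NULL;
}

/*
 * Toy analogue of an nresolve-style routine. *cached stands in for the
 * namecache entry: it records the vnode (or NULL for a miss) without
 * inheriting the reference, so the routine drops its own lock and ref.
 */
static int
toy_nresolve(struct toy_entry *tab, size_t n, const char *name,
    struct toy_vnode **cached)
{
    struct toy_vnode *vp = toy_lookup(tab, n, name);

    *cached = vp;
    if (vp == NULL)
        return TOY_ENOENT;  /* nothing was locked, so nothing to unlock */

    vp->locked = 0;         /* unlock ... */
    vp->refs--;             /* ... and release only what was obtained */
    return 0;
}
```

The point of the sketch is only that the unlock and release happen on the path where a vnode was actually obtained.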
Cheers,
Alex Hornung
The new code is at:
http://gitweb.dragonflybsd.org/~alexh/dragonfly.git/tree/5f72c8d7bd3081a5aeaec1707cf55de110a3385d:/sys/vfs/devfs
This is the panic:
kernel: type 12 trap, code=0
Stopped at lockmgr+0x8b: movl 0x4(%ebx),%ecx
db> trace
trace
lockmgr(c0,6) at lockmgr+0x8b
vn_unlock(0,0,d4484328,cf79c518,d4453af8) at vn_unlock+0x13
devfs_nresolve(d1f54b64,c0616760,c1e899f0,d4484328,cf79c518) at devfs_nresolve+0xa5
vop_nresolve(c1e899f0,d1f54ba4,d4484328,c1d45f58,cf79c518) at vop_nresolve+0x2f
cache_resolve(d1f54bd4,c1d45f58) at cache_resolve+0x30f
nlookup(d1f54c80,0,d1f54c80,d1f54cf0,d1f54cc4) at nlookup+0x25e
kern_stat(d1f54c80,d1f54c18,d1f54c34,c048f383,d1f067f4) at kern_stat+0xf
sys_lstat(d1f54cf0,6,0,0,cf79c2d8) at sys_lstat+0x35
syscall2(d1f54d40) at syscall2+0x1ef
Xint0x80_syscall() at Xint0x80_syscall+0x36
On Tue, 2009-05-26 at 16:02 -0700, Matthew Dillon wrote:
> :Hi,
> :
> :I'm running into quite a lot of trouble with vnode locking and
> :unlocking, releasing, putting, getting, ... I'm 100% sure I got most if
> :not all of the vnode locking wrong and I really need some help
> :understanding it. It is still unclear to me what vnops need to do what
> :with regard to locking/unlocking.
> :
> :My current code is here:
> :http://gitweb.dragonflybsd.org/~alexh/dragonfly.git/tree/9de1a66518d104077521a13d7b13ae958fd79d98:/sys/vfs/devfs
> :
> :Right now my concerns are mainly in devfs_vnops.c, where all the
> :locking/unlocking/... occurs or should occur.
> :
> :I'd appreciate some insight/comments/corrections/... on this issue!
> :
> :Alex Hornung
>
> One thing I noticed is that the devfs_root() code path does not
> look right. This routine can be called any number of times by
> the kernel, you don't want to allocate a new root vnode every
> time!
>
> devfs_root()->devfs_allocv()->getnewvnode()-> ...
>
> If the root node already has a vnode associated with it you have to
> ref and vn_lock and return that vnode, not allocate a new vnode.
> I think you might be overwriting previously acquired vnode pointers
> in that root node and that will certainly mess things up.
>
> --
>
> Another thing I noticed is that you need to remember that when you
> return a vnode in *vpp, the caller expects that vnode to be
> referenced (and possibly also locked), which means you don't release
> the vnode that you are also returning unless you have extra references
> (2 or more) and you need to get them down to one reference for the
> return.
>
> So for example the devfs_nsymlink() call is calling devfs_allocvp()
> but it is also returning the vp in *ap->a_vpp. In the case of
> devfs_nsymlink() I believe it is expected to return a referenced
> but NOT locked vnode, so you would unlock it but not dereference it
> before returning.
>
> You will want to check all the VNOPS that return a vnode in *ap->a_vpp
> for similar issues.
>
> In the case of nresolve I believe you are doing it correctly...
> VOP_NRESOLVE() does not return a vnode (there is no ap->a_vpp),
> it just expects the namecache entry to be resolved and the
> cache_setvp() call doesn't inherit any refs or anything so you unlock
> and release the vnode like you are doing.
>
> -Matt
> Matthew Dillon
> <dillon at backplane.com>
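
The "referenced but NOT locked" return convention described in the quoted reply can be sketched as a user-space toy. The names toy_allocvp and toy_nsymlink are hypothetical stand-ins, and the sketch assumes the allocation helper returns the vnode with exactly one reference and locked:

```c
#include <assert.h>
#include <stddef.h>

struct toy_vnode { int refs; int locked; };

/* Toy allocation helper: hands the vnode back referenced and locked. */
static void
toy_allocvp(struct toy_vnode *storage, struct toy_vnode **vpp)
{
    storage->refs = 1;
    storage->locked = 1;
    *vpp = storage;
}

/*
 * Toy analogue of a VNOP that returns a vnode in *vpp. The caller is
 * assumed to expect it referenced but not locked, so the lock is dropped
 * while the single reference is handed over instead of being released.
 */
static int
toy_nsymlink(struct toy_vnode *storage, struct toy_vnode **vpp)
{
    struct toy_vnode *vp;

    toy_allocvp(storage, &vp);
    vp->locked = 0;  /* unlock only; do NOT drop the reference */
    *vpp = vp;       /* the caller now owns that one reference */
    return 0;
}
```

After toy_nsymlink() returns, the vnode has refs == 1 and locked == 0, which is the state the caller is assumed to expect in this toy model.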