Working on vnode sequencing and locking - HEAD will destabilize for a little bit

Matthew Dillon dillon at
Wed Aug 9 15:08:28 PDT 2006

    I'm working on the vnode ref counting and locking, so HEAD may destabilize
    for a bit.  There are a couple of things going on here, all related
    to userland VFS interfacing.

    First, I am distinguishing between ref'ing a vnode that already has 
    refs (which is just one atomic add instruction), vs ref'ing a vnode
    that has 0 refs (which requires interactions with the recycling code).
    If I got the separation wrong the kernel will assert.  This commit is
    going in today.
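
    As a rough userland illustration of the two ref paths described above
    (not the actual kernel code), the split might look like the sketch
    below.  The struct and the sketch_* function names are hypothetical,
    and the interaction with the recycling code is reduced to a single
    flag so the example stays self-contained.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a vnode; only the ref count matters here. */
struct sketch_vnode {
    atomic_int refs;        /* number of active references */
    bool on_free_list;      /* true while the recycler may reclaim it */
};

/* Fast path: the caller knows the vnode already has refs, so a single
 * atomic add is enough.  The assert fires if the wrong path was used. */
static void sketch_vref(struct sketch_vnode *vp)
{
    int old = atomic_fetch_add(&vp->refs, 1);
    assert(old > 0);
}

/* Slow path: the 0 -> 1 transition.  A real kernel would coordinate
 * with the recycling code here; this sketch only clears a flag. */
static void sketch_vget(struct sketch_vnode *vp)
{
    vp->on_free_list = false;
    atomic_fetch_add(&vp->refs, 1);
}

/* Drop a ref; the last drop hands the vnode back to the recycler. */
static void sketch_vdrop(struct sketch_vnode *vp)
{
    int old = atomic_fetch_sub(&vp->refs, 1);
    assert(old > 0);
    if (old == 1)
        vp->on_free_list = true;
}
```

    Calling sketch_vref() on a vnode with 0 refs trips the assertion,
    which mirrors the "kernel will assert" behaviour mentioned above.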

    Secondly, I am going to start using the vnode's spinlock to control
    access to the vnode ref count.  That commit will go in tomorrow.
    It's an easy thing to do and it helps with the MP locking work.
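
    A minimal sketch of a ref count guarded by a per-vnode spinlock,
    assuming a simple test-and-set lock built on C11 atomic_flag.  The
    names (sv_vnode, sv_ref, and so on) are invented for illustration
    and are not the kernel's spinlock API.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical vnode with an embedded spinlock guarding its ref count. */
struct sv_vnode {
    atomic_flag spin;   /* per-vnode spinlock */
    int refs;           /* protected by spin */
};

static void sv_init(struct sv_vnode *vp)
{
    atomic_flag_clear(&vp->spin);
    vp->refs = 0;
}

static void sv_lock(struct sv_vnode *vp)
{
    /* Busy-wait until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&vp->spin, memory_order_acquire))
        ;
}

static void sv_unlock(struct sv_vnode *vp)
{
    atomic_flag_clear_explicit(&vp->spin, memory_order_release);
}

/* Add a reference under the spinlock; returns the new count. */
static int sv_ref(struct sv_vnode *vp)
{
    sv_lock(vp);
    int n = ++vp->refs;
    sv_unlock(vp);
    return n;
}

/* Drop a reference under the spinlock; returns the new count. */
static int sv_unref(struct sv_vnode *vp)
{
    sv_lock(vp);
    assert(vp->refs > 0);
    int n = --vp->refs;
    sv_unlock(vp);
    return n;
}
```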

    Third, the vnode locking code is getting an overhaul.  VOP_LOCK
    and VOP_UNLOCK are going to be removed - vnode locking will become
    native to the vnode and not run through VOP operations any more.   Most
    of the groundwork for this has already been done.

    This relates to userland VFS and clustering.   Because VOP operations
    may run over a communications stream, VOP_LOCK and VOP_UNLOCK simply
    do not fit into the scheme any more.

    Finally, the vnode locking will be moved out of the kernel layer and
    into the filesystem layer.  This is again for userland VFS and clustering.
    It will mean that we do not have to hold a vnode lock across a filesystem
    operation... we've already seen what that can lead to with NFS when an
    NFS server goes down.

    So instead of this:

	lock vnode
	VOP_READ	------->
				[userland VFS / syslink communications layer]
				do operation
	unlock vnode	<------

    We will have this:

	VOP_READ	------->
				[userland VFS / syslink communications layer]
				lock local representation of vnode
				do operation
				unlock local representation of vnode

    And, poof, no more indefinite blocking states in the kernel for NFS or
    for the upcoming userland VFS or clustering.
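
    The second diagram can be sketched in userland C roughly as follows,
    assuming a pthread mutex stands in for the filesystem's lock on its
    local representation of the vnode.  The names local_vnode, fs_read
    and kernel_vop_read are hypothetical, and the communications layer
    is reduced to a plain function call.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical filesystem-side ("local") representation of a vnode. */
struct local_vnode {
    pthread_mutex_t lock;   /* owned by the filesystem layer */
    char data[64];
    size_t size;            /* valid bytes in data */
};

/* Filesystem layer: locks its own representation only for the
 * duration of the operation, then unlocks it again. */
static size_t fs_read(struct local_vnode *lv, char *buf, size_t len)
{
    pthread_mutex_lock(&lv->lock);
    assert(lv->size <= sizeof lv->data);
    size_t n = len < lv->size ? len : lv->size;
    memcpy(buf, lv->data, n);
    pthread_mutex_unlock(&lv->lock);
    return n;
}

/* Kernel layer stand-in: forwards the request and holds no vnode
 * lock while the filesystem works on it. */
static size_t kernel_vop_read(struct local_vnode *lv, char *buf, size_t len)
{
    return fs_read(lv, buf, len);
}
```

    Because the caller never holds a lock across the call, a stalled
    filesystem in this model blocks only its own local lock holder.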

    There will be fine-grained range locks to maintain UNIX atomicity
    requirements, but since they aren't going to be hard locks they won't
    prevent basic things like ^C from working properly.
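
    One way to picture a range lock that is not a hard lock is a table
    of byte ranges with a non-blocking acquire: a conflicting request
    fails immediately, so the caller can retry, sleep interruptibly, or
    give up on a signal.  This is only a sketch under that assumption;
    the fixed-size table and the range_* names are made up for the
    example and say nothing about the eventual kernel design.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_RANGES 8

/* A locked half-open byte range [start, end). */
struct byte_range {
    size_t start;
    size_t end;
    bool used;
};

struct range_table {
    struct byte_range slot[MAX_RANGES];
};

static bool ranges_overlap(size_t s1, size_t e1, size_t s2, size_t e2)
{
    return s1 < e2 && s2 < e1;
}

/* Try to lock [start, end).  Returns a slot index on success, or -1
 * if the range conflicts with a held range or the table is full.
 * Never blocks, so the caller stays interruptible. */
static int range_try_lock(struct range_table *t, size_t start, size_t end)
{
    assert(start < end);
    int free_slot = -1;
    for (int i = 0; i < MAX_RANGES; i++) {
        if (!t->slot[i].used) {
            if (free_slot < 0)
                free_slot = i;
            continue;
        }
        if (ranges_overlap(start, end, t->slot[i].start, t->slot[i].end))
            return -1;
    }
    if (free_slot >= 0)
        t->slot[free_slot] = (struct byte_range){ start, end, true };
    return free_slot;
}

/* Release a range previously returned by range_try_lock(). */
static void range_unlock(struct range_table *t, int idx)
{
    assert(idx >= 0 && idx < MAX_RANGES && t->slot[idx].used);
    t->slot[idx].used = false;
}
```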

					Matthew Dillon 
					<dillon at xxxxxxxxxxxxx>
