panic: cleaned vnode isn't

Sun Feb 7 00:41:48 PST 2010

On Fri, Feb 05, 2010 at 04:43:59PM -0800, Matthew Dillon wrote:
>     Please try this patch:
> 
> 	fetch http://apollo.backplane.com/DFlyMisc/lock01.patch
> 
>     I don't know if this will fix it or not.  There is an issue in
>     allocfreevnode() where a vnode whose v_lock.lk_flags sets
>     LK_CANRECURSE can be improperly reallocated while in the middle
>     of being freed, but only if the filesystem's VOP_RECLAIM code
>     recurses.

This didn't fix it. There was a new crash last night, possibly during the
daily maintenance window at 3am.

>     So the only way I can think of for this crash to occur is if UFS
>     recurses in softupdates and allocates new vnodes while reclaiming
>     a vnode, the allocate code then reusing a HAMMER vnode and reclaiming
>     IT, and HAMMER then recursing and trying to allocate a new vnode
>     itself and winding up reusing the vnode UFS was originally trying to
>     reclaim.  A difficult path to say the least.

Only /boot is UFS on this machine, and it doesn't use softupdates.

>     Both your crash dump and the one I got from leaf today crashed on
>     a HAMMER vnode being reallocated with a seemingly impossible state.
>     Clearly an MP race, but I couldn't find a smoking gun related to
>     HAMMER itself.  Basically vp->v_mount was NULL, the vnode was in
>     a reclaimed state, but vp->v_data was still pointing at the
>     HAMMER inode and the HAMMER inode was still pointing back at the
>     vp.  That implies the vnode was reallocated back to the same
>     HAMMER inode recursively from within the VOP_RECLAIM itself,
>     which shouldn't be possible.
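
For reference, the inconsistent state described above can be written down as
a single predicate. This is only a minimal sketch: model_vnode, model_inode,
the reclaimed flag and vnode_state_is_inconsistent() are hypothetical
stand-ins invented for illustration, not the real kernel or HAMMER
definitions. Only the v_mount and v_data field names come from the
description above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for illustration, NOT the real kernel structures. */
struct model_inode;

struct model_vnode {
    void *v_mount;               /* NULL once the vnode has no mount */
    struct model_inode *v_data;  /* filesystem-private inode pointer */
    bool reclaimed;              /* stand-in for the "reclaimed" state */
};

struct model_inode {
    struct model_vnode *vp;      /* back pointer from inode to vnode */
};

/*
 * Returns true for the combination seen in the dumps: the vnode has no
 * mount and is marked reclaimed, yet it and the inode still point at
 * each other.
 */
bool vnode_state_is_inconsistent(const struct model_vnode *vp)
{
    return vp->v_mount == NULL &&
           vp->reclaimed &&
           vp->v_data != NULL &&
           vp->v_data->vp == vp;
}
```

A vnode that was reclaimed cleanly would have both links severed, so the
predicate would be false for it.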

Most of the crashes I've seen occurred during pkgsrc distfile extraction,
just after I did a pkgsrc cvs update.

I've put the new core dump online.

-- 
Francois Tigeot