[issue1996] panic: assertion: p->p_lock == 0 in kern_wait

YONETANI Tomokazu y0n3t4n1 at gmail.com
Mon Jun 6 09:54:27 PDT 2011


On Mon, Jun 06, 2011 at 05:33:02AM +0000, Venkatesh Srinivas (via DragonFly issue tracker) wrote:
> 
> Venkatesh Srinivas <vsrinivas at dragonflybsd.org> added the comment:
> 
> Hi,
> 
> I just saw a patch, 49aa3df0ca3e226c0a0d7097863a2426ee6fd534, go in to fix this
> issue; it adds:

Actually the real fix was 4a7e6f553, which removed the KKASSERT()
right before calling proc_remove_zombie().  I forgot mentioning
the issue1996 in the commit log, though.

> 
> 
> +
> +                       /*
> +                        * Temporary refs may still have been acquired while
> +                        * we removed the process, make sure they are all
> +                        * gone before kfree()ing.  Now that the process has
> +                        * been removed from all lists and all references to
> +                        * it have gone away, no new refs can occur.
> +                        */
> +                       while (p->p_lock)
> +                               tsleep(p, 0, "reap4", hz);
>                         kfree(p, M_PROC);
> 
> First, is anything required to ensure that p->p_lock is really loaded each loop
> iteration? Is the compiler allowed to optimize away the load after the first loop?

This is not an answer to your question, but there are similar wait loops
all over the kernel without `volatile' keywords on the pointer or the
struct member, so if such an optimization is actually allowed, the kernel
breaks in many ways.

> Second, I don't understand how this is safe; the problem here is that another
> code path obtained a reference to this process and was using it when the kfree()
> happened. What prevents this?

But for such a thing to happen, B needs to obtain the address of p before
it's completely removed from the lists held in other processes, lwp's,
or td's, and wait until A reaches kfree() below.  By the time A reaches
here, the process is (supposed to be) removed from all the lists, so B
can't find it right before calling PHOLD().  It may be possible if B
is using a cached pointer somewhere, but I haven't found such a code (yet),
at least not without holding tokens.


> A                                   B
> ...
> vm_waitproc(p)
> 
> while(p->p_lock)
>    tsleep(...)
>                                     /* get reference to process */
>                                     PHOLD(p)
> kfree(p)
>                                     /* HEY! */
> 
> Thanks,
> -- vs
> 
> _____________________________________________________
> DragonFly issue tracker <bugs at lists.dragonflybsd.org>
> <http://bugs.dragonflybsd.org/issue1996>
> _____________________________________________________





More information about the Bugs mailing list