panic: assertion: dd->uschedcp != lp in bsd4_resetpriority

Thu May 29 10:29:13 PDT 2008

:Hi.
:
:I'm seeing this panic several times recently, only on an SMP kernel
:running on VMware Fusion, with dual CPU support (not when the
:number of active processors available to guest OS is 1).  It occurs
:under load, for instance when building kernel or world.  I don't remember
:when this started, so it may or may not be VMware's problem.  I need to
:try it on a real SMP hardware.  Anyway, ...
:...
:
:At first I thought that other CPU has just modified after this CPU
:has unlocked bsd4_spin but before KKASSERT(), so I tried to defer
:spin_unlock_wr() as done in bsd4_setrunqueue():
:...
:This seemed to cease the assertion, but adding debugging stuff
:(like kprintf() or mycpu) also seemed to avoid the assertion
:(or made it diffcult to reproduce), so I'm not 100% sure this
:is enough, but other places manipulating bsd4_pcpu[] include:
:
:  bsd4_acquire_curproc:252: only when dd->uschedcp == NULL
:  bsd4_release_curproc:321: only when dd->uschedcp == lp
:  bsd4_setrunqueue:463:     only when gd == mycpu
:  bsd4_schedulerclock:581:  only modifier of rrcount
:
:which don't seem to need spinlocks (maybe, correct me if I'm wrong).
:
:Cheers.

    I think you basically found the bug, and it does make sense that
    kprintf's could disrupt the race that causes it because what you found
    is a true 'SMP' race between two cpus.

    Almost anything related to the globals (around line 146) is supposed to
    be spin-locked.  Numerous elements related to the per-cpu data
    (bsd4_pcpu[]), when operating on the current cpu, should not require
    spin locking.  That is the idea anyway.

    Setting and NULLing dd->uschedcp in the per-cpu data, for the current
    cpu, is not intended to need a spin lock.  Similarly the per-cpu
    rrcount field is a heuristic only used by the current cpu and does not
    need a spin lock.

    Scanning the global scheduler queue is intended to require a spinlock,
    since all the cpus are 'pulling' from the same global queue.

    The code you found the bug appears to be bsd4_resetpriority() when
    called on a 'random' lwp instead of the 'current' lwp on the calling
    cpu.   That is, when the softclock does the allproc_scan().  That
    scan only occurs on one cpu, with the MP lock held (I think), but can
    race against manipulation of the per-cpu scheduler data (dd->*) on other
    cpus.  Such manipulate can occur on other cpus without the MP lock held
    and without the spinlock held.

    Because the LWP in the failing path (the one doing the allproc_scan) is
    potentially owned by some other cpu, that KKASSERT() can race another
    cpu and cause a panic.  The bug is actually the presence of the KKASSERT
    and not the manipulation of the spin lock.

    Moving the spinlock thus might not completely solve the problem.

    I think all that needs to be done here is to remove the KKASSERT().  The
    worst that happens is that a cpu gets an extra need_resched IPI (which
    does no harm), or dd->upri winds up with the wrong value, which should
    also do no real harm.  I believe the KKASSERT itself is wrong.  Instead,
    augment the code comment above that section to indicate that the
    dd_uschedcp variable can be ripped out from under the code and
    cause a spurious need_user_resched() on the target cpu and caused
    dd->upri to be wrong for a short period of time.

					-Matt
					Matthew Dillon 
					<dillon at backplane.com>