panic: assertion: dd->uschedcp != lp in bsd4_resetpriority

YONETANI Tomokazu qhwt+dfly at les.ath.cx
Thu May 29 04:22:48 PDT 2008


Hi.

I've hit this panic several times recently, but only on an SMP kernel
running under VMware Fusion with dual CPU support (not when the
number of active processors available to the guest OS is 1).  It occurs
under load, for instance when building kernel or world.  I don't remember
when this started, so it may or may not be VMware's problem.  I need to
try it on real SMP hardware.  Anyway, ...

(kgdb) bt
#0  dumpsys () at ./machine/thread.h:83
#1  0xc02bc881 in boot (howto=256)
    at /dfsrc/current/sys/kern/kern_shutdown.c:375
#2  0xc02bcb44 in panic (fmt=0xc0507fb7 "assertion: %s in %s")
    at /dfsrc/current/sys/kern/kern_shutdown.c:800
#3  0xc02c2515 in bsd4_resetpriority (lp=0xcadf8900)
    at /dfsrc/current/sys/kern/usched_bsd4.c:792
#4  0xc02c26f8 in bsd4_recalculate_estcpu (lp=0xcadf8900)
    at /dfsrc/current/sys/kern/usched_bsd4.c:701
#5  0xc02ca0ef in schedcpu_stats (p=0xcae6f858, data=0x0)
    at /dfsrc/current/sys/kern/kern_synch.c:212
#6  0xc02b68e6 in allproc_scan (callback=0xc02ca099 <schedcpu_stats>, data=0x0)
    at /dfsrc/current/sys/kern/kern_proc.c:533
#7  0xc02c9d52 in schedcpu (arg=0x0)
    at /dfsrc/current/sys/kern/kern_synch.c:186
#8  0xc02cce3a in softclock_handler (arg=0xc060c7e0)
    at /dfsrc/current/sys/kern/kern_timeout.c:308
#9  0xc02c450b in lwkt_deschedule_self (td=Cannot access memory at address 0x8
)
    at /dfsrc/current/sys/kern/lwkt_thread.c:223

At first I thought that the other CPU had modified dd->uschedcp after
this CPU unlocked bsd4_spin but before the KKASSERT(), so I tried to
defer spin_unlock_wr() as done in bsd4_setrunqueue():

%%%
diff --git a/sys/kern/usched_bsd4.c b/sys/kern/usched_bsd4.c
index b934e3d..e3478a0 100644
--- a/sys/kern/usched_bsd4.c
+++ b/sys/kern/usched_bsd4.c
@@ -779,7 +779,6 @@ bsd4_resetpriority(struct lwp *lp)
 		lp->lwp_priority = newpriority;
 		reschedcpu = -1;
 	}
-	spin_unlock_wr(&bsd4_spin);
 
 	/*
 	 * Determine if we need to reschedule the target cpu.  This only
@@ -789,9 +788,14 @@ bsd4_resetpriority(struct lwp *lp)
 	 */
 	if (reschedcpu >= 0) {
 		dd = &bsd4_pcpu[reschedcpu];
-		KKASSERT(dd->uschedcp != lp);
+		if (dd->uschedcp == lp) {
+			kprintf("%p(%d): dd->uschedcp=lp=%p\n",
+				curthread, mycpu->gd_cpuid, lp);
+			goto out;
+		}
 		if ((dd->upri & ~PRIMASK) > (lp->lwp_priority & ~PRIMASK)) {
 			dd->upri = lp->lwp_priority;
+			spin_unlock_wr(&bsd4_spin);
 #ifdef SMP
 			if (reschedcpu == mycpu->gd_cpuid) {
 				need_user_resched();
@@ -802,8 +806,12 @@ bsd4_resetpriority(struct lwp *lp)
 #else
 			need_user_resched();
 #endif
+			crit_exit();
+			return;
 		}
 	}
+out:
+	spin_unlock_wr(&bsd4_spin);
 	crit_exit();
 }
 
%%%

This seemed to stop the assertion from firing, but adding debugging
code (such as kprintf() or references to mycpu) also seemed to avoid it
(or made it difficult to reproduce), so I'm not 100% sure this
is enough.  Other places manipulating bsd4_pcpu[] include:

  bsd4_acquire_curproc:252: only when dd->uschedcp == NULL
  bsd4_release_curproc:321: only when dd->uschedcp == lp
  bsd4_setrunqueue:463:     only when gd == mycpu
  bsd4_schedulerclock:581:  only modifier of rrcount

which don't seem to need spinlocks (maybe, correct me if I'm wrong).
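
The window suspected at first (a check made after the unlock) can be
modeled in user space.  What follows is only a sketch under loud
assumptions: struct pcpu, lock(), unlock(), other_cpu_acquire() and the
two check functions are hypothetical stand-ins, not DragonFly code, and
the model assumes the other CPU updates uschedcp only while the lock is
free, which the analysis above does not confirm.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures (not the real ones). */
struct lwp  { int lwp_priority; };
struct pcpu { struct lwp *uschedcp; int upri; };

/* Models bsd4_spin as a simple flag; there are no real threads here. */
static bool lock_held;

void lock(void)   { assert(!lock_held); lock_held = true; }
void unlock(void) { assert(lock_held);  lock_held = false; }

/*
 * Models the other CPU making lp its current user process.
 * ASSUMPTION: it can only do so while the lock is free.
 * Returns true if the update happened.
 */
bool other_cpu_acquire(struct pcpu *dd, struct lwp *lp)
{
	if (lock_held)
		return false;
	dd->uschedcp = lp;
	return true;
}

/* Unpatched ordering: unlock first, then check.  True if the check holds. */
bool check_after_unlock(struct pcpu *dd, struct lwp *lp)
{
	lock();
	/* ... priority recalculation would happen here ... */
	unlock();
	other_cpu_acquire(dd, lp);	/* other CPU slips into the window */
	return dd->uschedcp != lp;
}

/* Patched ordering: check while still holding the lock. */
bool check_under_lock(struct pcpu *dd, struct lwp *lp)
{
	bool ok;

	lock();
	other_cpu_acquire(dd, lp);	/* blocked: the lock is still held */
	ok = (dd->uschedcp != lp);
	unlock();
	return ok;
}
```

Starting from dd.uschedcp == NULL, check_after_unlock() returns false
(the modeled assertion would fire) and check_under_lock() returns true.
If the real modifiers do not take bsd4_spin at all, this model does not
apply and deferring the unlock would only narrow the window.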

Cheers.




