Dragonfly under KVM
Gary Allan
dragonfly at gallan.co.uk
Fri Jun 6 15:34:40 PDT 2008
Gary Allan wrote:
Hello,
I've been experiencing lock-ups using DragonFly HEAD SMP under kvm.
Running "make -j8 buildworld" leaves the guest completely unresponsive,
with 100.00% CPU usage on all four cores (as seen from the host OS).
I've managed to get gdb attached and get some information.
The kernel is getting caught in a while loop in lwkt_acquire. I can
reliably trigger this with a "make -j8 buildworld" under an SMP
kernel (otherwise identical to GENERIC, no optimisations). The OS is
completely unresponsive and all four CPU cores are running at 100%.
I've included the debug information.
Program received signal SIGINT, Interrupt.
lwkt_acquire (td=0xc6a59e70) at /usr/src/sys/kern/lwkt_thread.c:1048
1048 while (td->td_flags & (TDF_RUNNING|TDF_PREEMPT_LOCK))
(gdb) l
1043        mygd = mycpu;
1044        if (gd != mycpu) {
1045            cpu_lfence();
1046            KKASSERT((td->td_flags & TDF_RUNQ) == 0);
1047            crit_enter_gd(mygd);
1048            while (td->td_flags & (TDF_RUNNING|TDF_PREEMPT_LOCK))
1049                cpu_lfence();
1050            td->td_gd = mygd;
1051            TAILQ_INSERT_TAIL(&mygd->gd_tdallq, td, td_allq);
1052            td->td_flags &= ~TDF_MIGRATING;
(gdb) p td->td_flags
$1 = 8390177
(gdb) p td
$2 = (thread_t) 0xc6a59e70
(gdb) bt
#0 lwkt_acquire (td=0xc6a59e70) at /usr/src/sys/kern/lwkt_thread.c:1048
#1 0xc02c66af in bsd4_select_curproc (gd=0xff800000) at /usr/src/sys/kern/usched_bsd4.c:358
#2 0xc02c6829 in bsd4_release_curproc (lp=0xea634c00) at /usr/src/sys/kern/usched_bsd4.c:322
#3 0xc04b8239 in passive_release (td=0xdfe8aba0) at /usr/src/sys/platform/pc32/i386/trap.c:212
#4 0xc02c870b in lwkt_switch () at /usr/src/sys/kern/lwkt_thread.c:491
#5 0xc02c8b3b in lwkt_mp_lock_contested () at /usr/src/sys/kern/lwkt_thread.c:1374
#6 0xc04b0751 in get_mplock () at /usr/src/sys/platform/pc32/i386/mplock.s:168
#7 0xe9ef6d34 in ?? ()
#8 0xc04b94a4 in syscall2 (frame=0xe9ef6d40) at /usr/src/sys/platform/pc32/i386/trap.c:1371
#9 0xc04a3396 in Xint0x80_syscall () at /usr/src/sys/platform/pc32/i386/exception.s:876
#10 0xe9ef6d40 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) jump 1050
Continuing at 0xc02c8bbb.
After jumping past the loop to line 1050, execution continues without
any apparent problems.
I can provide additional debugging info if required but I'm unsure of
how to proceed with this myself.
Regards
Gary