[issue1325] ddb stops responding after resizing terminal window of vkernel
Matthew Dillon
dillon at apollo.backplane.com
Sat Mar 28 22:06:59 PDT 2009
:> I get the same freeze. It's looping in lwkt_send_ipiq3() inside this
:> while loop:
:>
:>   while (ip->ip_windex - ip->ip_rindex > MAXCPUFIFO / 4) {
:>       KKASSERT(ip->ip_windex - ip->ip_rindex != MAXCPUFIFO - 1);
:>       lwkt_process_ipiq();
:>   }
:>
:> I haven't yet tried Matt's debug patch. I don't get the freeze
:> immediately, only after the 8th time that read() in vconsgetc() is
:> interrupted. I'm running in xterm.
:>
:> Joe
:
:In fact, I can reproduce it without resizing-terminal tricks:
: su root -c 'for i in `jot 8 1`; do pkill -WINCH kernel; sleep 0.1; done'
:
:Cheers.
Excellent! I added a print_backtrace() to that loop and reproduced
the lockup with the kill -WINCH / sleep! I found the problem!

    lwkt_send_ipiq3(202,3,4340758b,0,434071e4) at 0x80c1b84
    lwkt_send_ipiq3(40400000,8099f00,3,0,434003b0) at 0x80c1b84
    sched_ithd(3,2828d154,4340720c,821de14,3) at 0x8099eef
    signalintr(3,2,823bc1c,1,43407574) at 0x821858b
    cons_unlock(1c,0,43407228,2,821ddfc) at 0x821de14

What is happening is that the SIGWINCH happens to hit a window where
no critical section is being held. That causes it to call
sched_ithd() instead of flagging the interrupt for future action.
sched_ithd() tries to send an IPI, but because the cpus have been
stopped cold by the debugger the IPI never gets sent. Once the
IPI function FIFO fills up it goes into that loop waiting for
the pending IPIs to be processed (which they never are because all
the other cpus are stopped).

The fix is very simple. I need only adjust the DDB code to enter
a critical section before it stops the cpus and exit it after it
restarts the cpus. I'll commit the fix tonight.

Excellent sleuthing!

-Matt
Matthew Dillon
<dillon at backplane.com>