kernel panic

Joe Talbott josepht at cstone.net
Fri May 11 13:16:45 PDT 2007


On Fri, May 11, 2007 at 12:49:20PM -0700, Matthew Dillon wrote:
> 
> :The strange thing is I was rebooting my laptop (via icewm) when this
> :occurred.  The interface is re(4) according to the kernel buffer output
> :which follows.  
> :
> :Joe
> 
>     I'm guessing there's an issue with re_init() or re_stop() that is
>     possibly being triggered by setting the IP address.
> 
>     re_init() for the RE interface looks like it is doing some dangerous
>     things... if there is DMA still operating while it is trying to
>     reinitialize the device, that could be causing the NMI.  It seems to be
>     writing 0x00 to the command register which I guess is supposed to stop
>     device operation, but it is not waiting for the device to actually stop
>     operating before it begins to free the TX and RX rings.
> 
>     Most network controllers these days are actually microcontrollers,
>     which means that commands do not instantaneously take effect when
>     you write to the command register.  Usually only the interrupt
>     control registers are hardwired.
> 
>     I have two questions.  First, when you ifconfig the interface with a
>     new IP address does it normally pause before returning?  That would
>     indicate that it is in fact doing a full device reset when configuring
>     an IP address.  Second, can you reproduce the problem?  Perhaps by
>     re-configuring the device's IP address over and over again in a loop?

There is a small delay of under 2 seconds.  I ran a loop that switched
between two IPs for about 15 minutes and nothing happened.  

The kernel buffer output in the corefile was from months ago.  I only
remembered because I did the same thing this time; shutdown now;
umount /home; ifconfig re0 ...  I don't know how this can be in a dump
months after the fact unless there is stale data in my swap partition
from my last coredump that hasn't been overwritten since I don't do
very much swapping.  This idea may be completely wrong.  I am 100%
certain that I'm not looking at a stale dump as strings on the kernel
and vmcore show them as being from May 9, 2007.  I am also certain
that I was not ifconfig'ing any interface when this happened.

Joe

> 
>     We may be able to 'fix' the problem simply by introducing a delay
>     after writing 0x00 to RE_COMMAND, or by calling re_reset() as part
>     of re_stop(), but I'd like a way to verify that doing so will actually
>     fix the problem.
> 
> 					-Matt
> 					Matthew Dillon 
> 					<dillon at backplane.com>
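
For reference, the "wait for the device to actually stop" idea could be
sketched roughly as below.  This is not the re(4) driver code: fake_nic,
nic_write_command(), nic_read_command(), nic_stop() and the CMD_* bit
values are all hypothetical names made up for illustration, and the
"device" is simulated in memory so the sketch can be compiled and run in
userland.  It only shows the pattern: write 0x00 to the command register,
poll until the enable bits read back as clear (with a bounded number of
polls), and only then release the rings.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical enable bits; not taken from the real re(4) headers. */
#define CMD_TX_ENB 0x04
#define CMD_RX_ENB 0x08

/* Simulated NIC: a written command only takes effect after some reads. */
struct fake_nic {
	uint8_t command;         /* value the command register reads back */
	uint8_t pending;         /* last value written, not yet applied */
	bool    has_pending;     /* true while a write is still in flight */
	int     latency_reads;   /* reads that still return the old value */
	bool    rings_allocated; /* stands in for the TX/RX ring memory */
};

/* Writing the register only queues the request, like a microcontroller. */
static void
nic_write_command(struct fake_nic *nic, uint8_t value)
{
	nic->pending = value;
	nic->has_pending = true;
}

/* Reading returns the old value until the simulated latency runs out. */
static uint8_t
nic_read_command(struct fake_nic *nic)
{
	if (nic->has_pending) {
		if (nic->latency_reads > 0) {
			nic->latency_reads--;
		} else {
			nic->command = nic->pending;
			nic->has_pending = false;
		}
	}
	return nic->command;
}

/*
 * Request a stop and poll until TX and RX both read back as disabled.
 * Returns true and releases the rings only once the device is idle;
 * returns false on timeout and leaves the rings untouched.
 */
static bool
nic_stop(struct fake_nic *nic, int max_polls)
{
	nic_write_command(nic, 0x00);
	for (int i = 0; i < max_polls; i++) {
		if ((nic_read_command(nic) & (CMD_TX_ENB | CMD_RX_ENB)) == 0) {
			nic->rings_allocated = false; /* now safe to free */
			return true;
		}
		/* A real driver would insert a short delay between polls. */
	}
	return false;
}
```

On timeout the sketch deliberately does not free anything, on the
assumption that a caller would fall back to a full reset instead.  Whether
the real hardware reports completion through the command register this
way is something I have not verified.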




