nfs + msdosfs = crashes & panics

Matthew Dillon dillon at apollo.backplane.com
Tue Apr 13 15:25:06 PDT 2004


:OK, it looks like this part was a red herring.  Reading the same set of
:files from a UFS-backed share caused the same lockup:
:
:The client has an ISA NIC, so (even though the server is far from a
:high-performance machine) I followed the suggestion in the FreeBSD
:handbook and added "-r=1024" to the client's fstab.  Now, I can't seem
:to reproduce the lockup with either msdosfs or UFS backing, which is
:good, sorry for the distraction.
:
:As for writing, though, the bug looks genuine.

    Ok, we'll set that aside.  It could be a DMA issue or an MTU/large-packet
    issue.  Many so-called NFS lockups are due to bad cables that can't
    handle large packets, or to NICs that cannot handle large packets or
    many small back-to-back packets (or back-to-back packets in general),
    and so forth.  It sounds like you may have a cable or NIC issue
    triggered by getting a whole bunch of packet fragments all at once.
    A standard NFS packet has an 8K data payload, which results in 5 or 6
    physical packets on the wire.  There are many older (primarily ISA)
    NICs that just can't handle that.
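
    The packet count above can be checked with a rough sketch.  It assumes
    NFS over UDP on Ethernet with a 1500-byte MTU, a 20-byte IP header with
    no options, and a made-up 128-byte allowance for the RPC/NFS headers
    (RPC_OVERHEAD is a hypothetical figure, not a measured one):

```python
def ip_fragments(udp_payload, mtu=1500, ip_header=20, udp_header=8):
    """Count the IPv4 fragments needed to carry one UDP datagram."""
    # Every fragment except the last must carry a multiple of 8 data bytes.
    per_fragment = (mtu - ip_header) // 8 * 8
    datagram = udp_payload + udp_header
    # Ceiling division without floating point.
    return (datagram + per_fragment - 1) // per_fragment


# Assumed allowance for RPC and NFS headers ahead of the data (hypothetical).
RPC_OVERHEAD = 128

default_read = ip_fragments(8192 + RPC_OVERHEAD)  # 8K read reply
small_read = ip_fragments(1024 + RPC_OVERHEAD)    # read reply with -r=1024

print(default_read, small_read)  # prints "6 1"
```

    Under those assumptions an 8K read reply arrives as a burst of 6
    fragments, while a 1024-byte read fits in a single packet, which would
    be consistent with "-r=1024" making the lockup go away.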

:Writing files to the UFS-backed share does not cause the panic.  But
:writing to an msdosfs-backed share does cause a "type 12 trap" which
:causes DDB to come up, and here's what I get on the console:
:
:
:kernel: type 12 trap, code=0
:Stopped at	msdosfs_write+0x31:	movl	0x34(%edx),%ebx
:db> trace
:msdosfs_write(c9274934) at msdosfs_write+0x31
:nfsrv_write(c0b48b08,c0b51a00,c481f800,c9274abc,0) at nfsrv_write+0x8e8
:nfssvc_nsfd(c9274b20,807d720,c481f800,0,c44bfa40) at nfssvc_nsfd+0x51a
:nfssvc(c9274c4c,4,0,0,c9274d20) at nfssvc+0x6dd
:syscall2(2f,2f,2f,0,0) at syscall2+0x24e
:Xint0x80_syscall() at Xint0x80_syscall+0x2a
:db> panic
:...
:I can't seem to coax debug symbols out of gdb when I run it on the crash
:dump, even though I'm 100% certain my kernel config contains
:"makeoptions DEBUG=-g" (I saw -g passed along during the kernel build,
:too.)  Here's what gdb does give me:
:

    That works, but the kernel with debug symbols does not get installed.
    It stays in your kernel build directory as 'kernel.debug'.  If you
    are using make buildkernel, then the kernel build directory is
    probably /usr/obj/usr/src/sys/<KERNELNAME>.  Note that the kernel.debug
    must be from the exact build that was installed on the machine.

    If you can upload the core and kernel.debug to leaf I'll take a look
    at the crash.

:Fatal trap 12: page fault while in kernel mode
:fault virtual address	= 0x34
:fault code		= supervisor read, page not present
:instruction pointer	= 0x8:0xc01c0009

    It's obviously a null-pointer indirection of some sort.
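
    The reasoning: the faulting instruction reads from 0x34(%edx), and the
    fault virtual address is exactly 0x34, so %edx must have been 0.  A
    small sketch of that arithmetic follows.  HypotheticalNode is an
    invented layout used only to place a field at offset 0x34; it is not
    the real structure that msdosfs_write() dereferences:

```python
import ctypes


class HypotheticalNode(ctypes.Structure):
    """Invented layout: 0x34 bytes of other members, then a 32-bit field."""
    _fields_ = [
        ("other_members", ctypes.c_uint8 * 0x34),
        ("field", ctypes.c_uint32),
    ]


def fault_address(base_pointer, field_offset):
    """Address touched by a load of the form field_offset(base_pointer)."""
    return base_pointer + field_offset


offset = HypotheticalNode.field.offset

# With a NULL base pointer the access lands at the bare field offset.
print(hex(offset), hex(fault_address(0, offset)))  # prints "0x34 0x34"
```

    In other words, a fault address equal to a small structure offset is
    the usual signature of a member access through a NULL pointer.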

:Not much point uploading the kernel image if it doesn't have symbols in
:it, but I do think we can rule out Jeffrey's network-related work here,
:it looks specifically like a nfs->msdosfs interaction gone bad.  I'll
:try to look into the source myself later on, on the off chance there's
:some obvious mismatch between nfsrv_write() and msdosfs_write().
:
:-Chris

    Find the kernel.debug.

					-Matt
					Matthew Dillon 
					<dillon at xxxxxxxxxxxxx>




