crash with network01.patch

Matthew Dillon dillon at apollo.backplane.com
Tue Nov 29 09:44:35 PST 2005


:After patching for the previously reported crash regarding
:tcp_syncache.bucket_limit, I also installed the network01.patch,
:and saw this panic this morning.  I realize that an updated patchset
:has been committed, but I'm not sure if this problem was fixed.  Kernel
:and core are on leaf in ~pavalos/crash/*.10.
:
:Matt:  This is the same thing I emailed you about this morning, except
:that I previously said I had network02.patch, when in fact I had #1.
:
:--Peter

    Only one subsystem uses SF-based mbufs right now, and that is sendfile().

    It looks like sf_buf_mfree() is being called too many times, but there's
    a very good chance that the problem is related to the MPSAFE code.  From
    your crash you were running on a 2-cpu box with intr_mpsafe set to 1.

    It looks like the problem is due to an SMP race from sf_buf_mfree()
    being called without the big giant lock being held.  This crash should
    theoretically not occur if intr_mpsafe is set to 0.  I will go through
    all the external mbuf code today and come up with a solution.  I totally
    forgot about external mbuf management when I did the first patch.
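
    As a rough illustration of how an unlocked reference count can end up
    being freed one time too many, here is a minimal single-threaded sketch
    in C.  It is not taken from the kernel sources: the struct and the
    sketch_* function names are hypothetical, and only the field name
    mref_count is borrowed from the panic message.  It hand-simulates one
    interleaving in which two cpus add a reference at the same time and one
    of the updates is lost.  Whether this is the interleaving behind the
    crash above is an assumption, not something the dump confirms.

```c
/* Hypothetical stand-in for the shared buffer bookkeeping; only the
 * field name mref_count comes from the panic message. */
struct sketch_sfm {
    int mref_count;
};

/* Two cpus each add one reference.  With `serialized` nonzero, each
 * load/add/store finishes before the next begins, as if a lock were
 * held.  With it zero, both cpus load before either stores, which is
 * one interleaving that an unlocked update permits. */
void sketch_ref_from_two_cpus(struct sketch_sfm *sfm, int serialized)
{
    if (serialized) {
        sfm->mref_count = sfm->mref_count + 1;  /* cpu0 */
        sfm->mref_count = sfm->mref_count + 1;  /* cpu1 */
    } else {
        int cpu0 = sfm->mref_count;             /* cpu0 loads */
        int cpu1 = sfm->mref_count;             /* cpu1 loads the same value */
        sfm->mref_count = cpu0 + 1;             /* cpu0 stores */
        sfm->mref_count = cpu1 + 1;             /* cpu1 overwrites cpu0's store */
    }
}

/* Drop one reference.  Returns 1 if the count was already zero (the
 * condition the kernel assertion would panic on), 0 otherwise. */
int sketch_free(struct sketch_sfm *sfm)
{
    if (sfm->mref_count <= 0)
        return 1;
    sfm->mref_count--;
    return 0;
}

/* One initial reference plus two added ones means three holders, so
 * three frees are legitimate.  Returns how many of those frees find
 * the count already at zero. */
int sketch_bad_frees(int serialized)
{
    struct sketch_sfm sfm = { 1 };
    int bad = 0;
    int i;

    sketch_ref_from_two_cpus(&sfm, serialized);
    for (i = 0; i < 3; i++)
        bad += sketch_free(&sfm);
    return bad;
}
```

    Called from a small main(), sketch_bad_frees(1) returns 0 and
    sketch_bad_frees(0) returns 1: in the unserialized case the third free
    finds the count already at zero, even though every free was matched by
    a reference.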

					-Matt
					Matthew Dillon 
					<dillon at xxxxxxxxxxxxx>

:panic: assertion: sfm->mref_count > 0 in sf_buf_mfree
:mp_lock = 00000001; cpuid = 1; lapic.id = 06000000
:boot() called on cpu#1
:Uptime: 2d9h40m41s
:
:dumping to dev #da/0x20001, offset 378927
:
:[...]
:
:#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:527
:527             if (dumping++) {
:dumpsys () at /usr/src/sys/kern/kern_shutdown.c:527
:527             if (dumping++) {
:(kgdb) bt
:#0  dumpsys () at /usr/src/sys/kern/kern_shutdown.c:527
:#1  0xc0190241 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:360
:#2  0xc01907d9 in panic (fmt=0xc0309fac "assertion: sfm->mref_count > 0 in %s") at /usr/src/sys/kern/kern_shutdown.c:673
:#3  0xc01c6ba2 in sf_buf_mfree (arg=0xebe10d80) at /usr/src/sys/kern/uipc_syscalls.c:1268
:#4  0xc01be30b in m_free (m=0xecb86900) at /usr/src/sys/kern/uipc_mbuf.c:792
:#5  0xc01be3dd in m_freem (m=0xecb86900) at /usr/src/sys/kern/uipc_mbuf.c:822
:#6  0xc01c4a64 in sbdrop (sb=0xda814440, len=40) at /usr/src/sys/kern/uipc_socket2.c:907
:#7  0xc0206ea1 in tcp_input (m=0xea3ade00) at /usr/src/sys/netinet/tcp_input.c:2150
:#8  0xc01ff27e in transport_processing_oncpu (m=0xea3ade00, hlen=20, ip=0x0, nexthop=0x0) at /usr/src/sys/netinet/ip_input.c:421
:#9  0xc01ffd2d in ip_input (m=0xea3ade00) at /usr/src/sys/netinet/ip_input.c:1121
:#10 0xc01ff2f6 in ip_input_handler (msg0=0xc3e063c0) at /usr/src/sys/netinet/ip_input.c:452
:#11 0xc020a1a2 in tcpmsg_service_loop (dummy=0x0) at /usr/src/sys/netinet/tcp_subr.c:391
:#12 0xc019779f in lwkt_create (func=0, arg=0x0, tdp=0x1000, template=0x0, tdflags=---Can't read userspace from dump, or kernel process---
:
:) at /usr/src/sys/kern/lwkt_thread.c:1362
:Previous frame inner to this frame (corrupt stack?)




