All new processes stuck in "flstik" state

Rumko rumcic at gmail.com
Wed Dec 10 00:46:16 PST 2008


Matthew Dillon wrote:

> 
> :Most of the time when I see processes in flstik state, the machine recovers,
> :but I have found a way to reproduce a situation in which all new
> :processes get stuck in flstik state (even ssh-ing into the machine creates a
> :new sshd process which gets stuck in flstik state and won't continue, can't
> :login locally, etc.).
> :
> :The machine has /boot on UFS and / on HAMMER with several PFSs (set up using
> :/usr/share/examples/rconfig/hammer.sh, except that my /var/tmp isn't a PFS
> :but a softlink to /tmp).
> :
> :If I run http://pastebin.ca/1279548 (with http://pastebin.ca/1279549 as the
> :pkgsrc.img.label) two, sometimes three times, all new processes get stuck in
> :flstik state during installworld.
> :I panicked the machine and uploaded the cores to
> :leaf:~rumko/crash/{kernel,vmcore}.{0,1} (I took the first memory dump the
> :first time it happened, and the second after successfully reproducing the
> :situation).
> :--
> :Regards,
> :Rumko
> 
>     I'm guessing that the problem is due to running HAMMER on the VN
>     backed by a file on another HAMMER filesystem.
> 
>     The kgdb on leaf is unable to list the threads in your dump, probably
>     due to structural mismatches.  Could you do it on your box and post a
>     backtrace of the threads stuck in flstik?  I am going to guess that
>     the bd_wait() they are stuck in is deep inside VN, probably the
>     path:  VNDEVICE->VOP_WRITE->(HAMMER)->bwillwrite()->bd_wait().
> 
>     Run 'kgdb kernel.1 vmcore.1' and do an 'info thread'.  The threads
>     we are interested in are 0xd8fa2000, 0xd8fa2600, 0xd8fb5300,
>     0xdce45400, 0xd8fb4800, and 0xdce46d00.  For each one find the
>     thread number and do 'thread <number>' and 'back' (abbreviation
>     for backtrace).
> 
>     If that is the problem the solution is really simple, I can just
>     pass a flag in the VOP_WRITE to tell HAMMER not to call bwillwrite().
> 
> -Matt
> Matthew Dillon
> <dillon at backplane.com>
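
The fix suggested above (a flag on the write so that HAMMER skips bwillwrite()) can be illustrated with a small user-space toy model. This is only a sketch: the flag name IO_SKIP_THROTTLE, its value, and the ToyBufferCache and toy_vop_write names are all made up for the illustration and are not taken from the kernel sources.

```python
from dataclasses import dataclass

# Hypothetical flag name and value, chosen for this sketch only.
IO_SKIP_THROTTLE = 0x1000


@dataclass
class ToyBufferCache:
    dirty: int   # bytes of dirty buffer space outstanding
    limit: int   # writers are throttled while dirty exceeds this

    def over_limit(self) -> bool:
        return self.dirty > self.limit


def toy_vop_write(cache: ToyBufferCache, nbytes: int, ioflag: int) -> str:
    """Toy model of a filesystem write entry point with an optional throttle."""
    throttled = not (ioflag & IO_SKIP_THROTTLE)
    if throttled and cache.over_limit():
        # Stands in for bwillwrite() -> bd_wait(): the caller would sleep here.
        return "blocked"
    cache.dirty += nbytes
    return "written"
```

In this model an ordinary write (ioflag 0) against an over-limit cache returns "blocked", while the same write carrying the flag goes through. That is the behaviour one would want for writes issued by the VN device on behalf of the flush path, since those are the writes that are supposed to drain the dirty buffers in the first place.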

0xd8fa2000:
#0  0xc01cb977 in lwkt_switch () at /usr/src/sys/kern/lwkt_thread.c:725
#1  0xc01d123a in tsleep (ident=0xc0456234, flags=0, wmesg=0xc03cd749 "flstik",
timo=100) at /usr/src/sys/kern/kern_synch.c:489
#2  0xc01fe282 in bd_wait (totalspace=16384) at /usr/src/sys/kern/vfs_bio.c:409
#3  0xc01c8be8 in bwillwrite (bytes=65536)
at /usr/src/sys/kern/kern_iosched.c:66
#4  0xc0311e2d in hammer_vop_write (ap=0xdaa2b858)
at /usr/src/sys/vfs/hammer/hammer_vnops.c:393
#5  0xc0218ffe in vop_write (ops=0xd2744e90, vp=0xdd29c068, uio=0xdaa2b8f0,
ioflag=256, cred=0xdd04f1e8) at /usr/src/sys/kern/vfs_vopops.c:351
#6  0xc0196af7 in vnstrategy (ap=0xdaa2b930)
at /usr/src/sys/dev/disk/vn/vn.c:396
#7  0xc01ad0c2 in dev_dstrategy_chain (dev=0xdcef9c58, bio=0xc3ced39c)
at /usr/src/sys/kern/kern_device.c:250
#8  0xc021c8eb in spec_strategy (ap=0xdaa2b97c)
at /usr/src/sys/vfs/specfs/spec_vnops.c:506
#9  0xc021c6b9 in spec_vnoperate (ap=0xdaa2b97c)
at /usr/src/sys/vfs/specfs/spec_vnops.c:136
#10 0xc0218fa2 in vop_strategy (ops=0xd2744fb0, vp=0xdd28f868, bio=0xc3ced39c)
at /usr/src/sys/kern/vfs_vopops.c:659
#11 0xc01fd492 in vn_strategy (vp=0x0, bio=0x0)
at /usr/src/sys/kern/vfs_bio.c:3082
#12 0xc030184f in hammer_io_direct_write (hmp=0xdd569000, record=0xc3a17c80,
bio=0xc3ced32c) at /usr/src/sys/vfs/hammer/hammer_io.c:1211
#13 0xc0310642 in hammer_vop_strategy (ap=0xdaa2bb34)
at /usr/src/sys/vfs/hammer/hammer_vnops.c:2663
#14 0xc0218fa2 in vop_strategy (ops=0xd274b190, vp=0xdf14ee68, bio=0xc3ced32c)
at /usr/src/sys/kern/vfs_vopops.c:659
#15 0xc01fd492 in vn_strategy (vp=0x0, bio=0x0)
at /usr/src/sys/kern/vfs_bio.c:3082
#16 0xc0200ef4 in bwrite (bp=0xc3ced2fc) at /usr/src/sys/kern/vfs_bio.c:790
#17 0xc0201c1d in bawrite (bp=0xc3ced2fc) at /usr/src/sys/kern/vfs_bio.c:964
#18 0xc020f1fa in vfsync_bp (bp=0xc3ced2fc, data=0xdaa2bc00)
at /usr/src/sys/kern/vfs_subr.c:828
#19 0xc020c38a in buf_rb_tree_RB_SCAN (head=0xdf14eecc, scancmp=0xc020d3ce
<vfsync_data_only_cmp>, callback=0xc020f030 <vfsync_bp>, data=0xdaa2bc00)
    at /usr/src/sys/kern/vfs_subr.c:139
#20 0xc020f33c in vfsync (vp=0xdf14ee68, waitfor=2, passes=1, checkdef=0,
waitoutput=0) at /usr/src/sys/kern/vfs_subr.c:678
#21 0xc03115dd in hammer_vop_fsync (ap=0xdaa2bc58)
at /usr/src/sys/vfs/hammer/hammer_vnops.c:199
#22 0xc02182fe in vop_fsync (ops=0xd274b190, vp=0xdf14ee68, waitfor=2)
at /usr/src/sys/kern/vfs_vopops.c:449
#23 0xc03082a2 in hammer_sync_scan2 (mp=0xd8f651d8, vp=0x0, data=0xdaa2bcf0)
at /usr/src/sys/vfs/hammer/hammer_ondisk.c:1518
#24 0xc0210898 in vmntvnodescan (mp=0xd8f651d8, flags=17, fastfunc=0xc030814f
<hammer_sync_scan1>, slowfunc=0xc030826e <hammer_sync_scan2>,
    data=0xdaa2bcf0) at /usr/src/sys/kern/vfs_mount.c:1005
#25 0xc030820c in hammer_sync_hmp (hmp=0xdd569000, waitfor=4)
at /usr/src/sys/vfs/hammer/hammer_ondisk.c:1474
#26 0xc030e909 in hammer_vfs_sync (mp=0xd8f651d8, waitfor=4)
at /usr/src/sys/vfs/hammer/hammer_vfsops.c:907
#27 0xc02116dc in sync_fsync (ap=0xdaa2bd40)
at /usr/src/sys/kern/vfs_sync.c:410
#28 0xc02182fe in vop_fsync (ops=0xc041fa20, vp=0xddb42168, waitfor=4)
at /usr/src/sys/kern/vfs_vopops.c:449
#29 0xc02118b8 in sched_sync () at /usr/src/sys/kern/vfs_sync.c:214
#30 0xc01b750b in suspend_kproc (td=Cannot access memory at address 0x8
) at /usr/src/sys/kern/kern_kthread.c:158


0xd8fa2600:
#0  0xc01cb977 in lwkt_switch () at /usr/src/sys/kern/lwkt_thread.c:725
#1  0xc01d120c in tsleep (ident=0xc0456234, flags=0, wmesg=0xc03cd749 "flstik",
timo=100) at /usr/src/sys/kern/kern_synch.c:478
#2  0xc01fe282 in bd_wait (totalspace=16384) at /usr/src/sys/kern/vfs_bio.c:409
#3  0xc01c8be8 in bwillwrite (bytes=16384)
at /usr/src/sys/kern/kern_iosched.c:66
#4  0xc0311e2d in hammer_vop_write (ap=0xdce6ab5c)
at /usr/src/sys/vfs/hammer/hammer_vnops.c:393
#5  0xc0218ffe in vop_write (ops=0xd2744e90, vp=0xd2589968, uio=0xdce6ac98,
ioflag=8323075, cred=0xc39a58c8) at /usr/src/sys/kern/vfs_vopops.c:351
#6  0xc021801e in vn_write (fp=0xd61e7c80, uio=0xdce6ac98, cred=0xc39a58c8,
flags=0) at /usr/src/sys/kern/vfs_vnops.c:715
#7  0xc01e2dd8 in kern_pwritev (fd=9, auio=0xdce6ac98, flags=0, res=0xdce6acf0)
at /usr/src/sys/sys/file2.h:72
#8  0xc01e3414 in sys_writev (uap=0xdce6acf0)
at /usr/src/sys/kern/sys_generic.c:389
#9  0xc0393bce in syscall2 (frame=0xdce6ad40)
at /usr/src/sys/platform/pc32/i386/trap.c:1386
#10 0xc037dab6 in Xint0x80_syscall ()
at /usr/src/sys/platform/pc32/i386/exception.s:876
#11 0x280c5654 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)


0xd8fb5300:
#0  0xc01cb977 in lwkt_switch () at /usr/src/sys/kern/lwkt_thread.c:725
#1  0xc01d120c in tsleep (ident=0xc0456234, flags=0, wmesg=0xc03cd749 "flstik",
timo=100) at /usr/src/sys/kern/kern_synch.c:478
#2  0xc01fe282 in bd_wait (totalspace=16384) at /usr/src/sys/kern/vfs_bio.c:409
#3  0xc01c8bc8 in bwillinode (n=1) at /usr/src/sys/kern/kern_iosched.c:87
#4  0xc021744a in vn_open (nd=0xdd4c6c80, fp=0xd61e8d60, fmode=514, cmode=420)
at /usr/src/sys/kern/vfs_vnops.c:159
#5  0xc021486d in kern_open (nd=0xdd4c6c80, oflags=513, mode=420,
res=0xdd4c6cf0) at /usr/src/sys/kern/vfs_syscalls.c:1724
#6  0xc0214a42 in sys_open (uap=0xdd4c6cf0)
at /usr/src/sys/kern/vfs_syscalls.c:1834
#7  0xc0393bce in syscall2 (frame=0xdd4c6d40)
at /usr/src/sys/platform/pc32/i386/trap.c:1386
#8  0xc037dab6 in Xint0x80_syscall ()
at /usr/src/sys/platform/pc32/i386/exception.s:876
#9  0x280e3f54 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)


0xdce45400:
#0  0xc01cb977 in lwkt_switch () at /usr/src/sys/kern/lwkt_thread.c:725
#1  0xc01d123a in tsleep (ident=0xdd569194, flags=0, wmesg=0xc03d56c5 "hmrfl1",
timo=0) at /usr/src/sys/kern/kern_synch.c:489
#2  0xc0301fc0 in hammer_io_wait_all (hmp=0xdd569000,
ident=0xc03d56c5 "hmrfl1") at /usr/src/sys/vfs/hammer/hammer_io.c:160
#3  0xc02fbf22 in hammer_flusher_finalize (trans=0xdd569114, final=1)
at /usr/src/sys/vfs/hammer/hammer_flusher.c:641
#4  0xc02fc7e6 in hammer_flusher_master_thread (arg=0xdd569000)
at /usr/src/sys/vfs/hammer/hammer_flusher.c:360
#5  0xc01cb142 in lwkt_deschedule_self (td=Cannot access memory at address 0x8
) at /usr/src/sys/kern/lwkt_thread.c:228
Backtrace stopped: previous frame inner to this frame (corrupt stack?)


0xd8fb4800:
#0  0xc01cb977 in lwkt_switch () at /usr/src/sys/kern/lwkt_thread.c:725
#1  0xc01d120c in tsleep (ident=0xc0456234, flags=0, wmesg=0xc03cd749 "flstik",
timo=100) at /usr/src/sys/kern/kern_synch.c:478
#2  0xc01fe282 in bd_wait (totalspace=16384) at /usr/src/sys/kern/vfs_bio.c:409
#3  0xc01c8bc8 in bwillinode (n=1) at /usr/src/sys/kern/kern_iosched.c:87
#4  0xc021744a in vn_open (nd=0xdd41cc80, fp=0xd61e8b68, fmode=514, cmode=420)
at /usr/src/sys/kern/vfs_vnops.c:159
#5  0xc021486d in kern_open (nd=0xdd41cc80, oflags=513, mode=420,
res=0xdd41ccf0) at /usr/src/sys/kern/vfs_syscalls.c:1724
#6  0xc0214a42 in sys_open (uap=0xdd41ccf0)
at /usr/src/sys/kern/vfs_syscalls.c:1834
#7  0xc0393bce in syscall2 (frame=0xdd41cd40)
at /usr/src/sys/platform/pc32/i386/trap.c:1386
#8  0xc037dab6 in Xint0x80_syscall ()
at /usr/src/sys/platform/pc32/i386/exception.s:876
#9  0x280c1f54 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)


0xdce46d00:
#0  0xc01cb977 in lwkt_switch () at /usr/src/sys/kern/lwkt_thread.c:725
#1  0xc01d120c in tsleep (ident=0xc0456234, flags=0, wmesg=0xc03cd749 "flstik",
timo=100) at /usr/src/sys/kern/kern_synch.c:478
#2  0xc01fe282 in bd_wait (totalspace=16384) at /usr/src/sys/kern/vfs_bio.c:409
#3  0xc01c8bc8 in bwillinode (n=1) at /usr/src/sys/kern/kern_iosched.c:87
#4  0xc021744a in vn_open (nd=0xddc6dc80, fp=0xdccc40e0, fmode=1538, cmode=420)
at /usr/src/sys/kern/vfs_vnops.c:159
#5  0xc021486d in kern_open (nd=0xddc6dc80, oflags=1537, mode=438,
res=0xddc6dcf0) at /usr/src/sys/kern/vfs_syscalls.c:1724
#6  0xc0214a42 in sys_open (uap=0xddc6dcf0)
at /usr/src/sys/kern/vfs_syscalls.c:1834
#7  0xc0393bce in syscall2 (frame=0xddc6dd40)
at /usr/src/sys/platform/pc32/i386/trap.c:1386
#8  0xc037dab6 in Xint0x80_syscall ()
at /usr/src/sys/platform/pc32/i386/exception.s:876
#9  0x28144f54 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
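
To make the traces above easier to compare, here is a rough sketch of a script that groups frames by thread and checks whether bd_wait() was reached through the VN device. It assumes the input looks like the text in this mail (a thread address followed by a colon, then gdb "#N" frame lines); the function names summarize and via_vn_device are made up for the sketch, and SAMPLE is a shortened excerpt of two of the traces.

```python
import re

THREAD_RE = re.compile(r"^(0x[0-9a-f]+):\s*$")
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\w+) \(")
WMESG_RE = re.compile(r'wmesg=0x[0-9a-f]+ "(\w+)"')


def summarize(text):
    """Map each thread address to its wait message and frame function names."""
    threads = {}
    current = None
    for line in text.splitlines():
        m = THREAD_RE.match(line)
        if m:
            current = threads.setdefault(m.group(1), {"wmesg": None, "frames": []})
            continue
        if current is None:
            continue
        m = FRAME_RE.match(line)
        if m:
            current["frames"].append(m.group(2))
        m = WMESG_RE.search(line)
        if m and current["wmesg"] is None:
            current["wmesg"] = m.group(1)
    return threads


def via_vn_device(frames):
    """True if bd_wait appears below (was called from) a vnstrategy frame."""
    if "bd_wait" not in frames or "vnstrategy" not in frames:
        return False
    # Frames are listed innermost first, so the caller has the larger index.
    return frames.index("vnstrategy") > frames.index("bd_wait")


SAMPLE = """\
0xd8fa2000:
#0  0xc01cb977 in lwkt_switch () at /usr/src/sys/kern/lwkt_thread.c:725
#1  0xc01d123a in tsleep (ident=0xc0456234, flags=0, wmesg=0xc03cd749 "flstik",
timo=100) at /usr/src/sys/kern/kern_synch.c:489
#2  0xc01fe282 in bd_wait (totalspace=16384) at /usr/src/sys/kern/vfs_bio.c:409
#3  0xc01c8be8 in bwillwrite (bytes=65536)
#4  0xc0311e2d in hammer_vop_write (ap=0xdaa2b858)
#5  0xc0218ffe in vop_write (ops=0xd2744e90, vp=0xdd29c068, uio=0xdaa2b8f0,
#6  0xc0196af7 in vnstrategy (ap=0xdaa2b930)

0xd8fb5300:
#0  0xc01cb977 in lwkt_switch () at /usr/src/sys/kern/lwkt_thread.c:725
#1  0xc01d120c in tsleep (ident=0xc0456234, flags=0, wmesg=0xc03cd749 "flstik",
timo=100) at /usr/src/sys/kern/kern_synch.c:478
#2  0xc01fe282 in bd_wait (totalspace=16384) at /usr/src/sys/kern/vfs_bio.c:409
#3  0xc01c8bc8 in bwillinode (n=1) at /usr/src/sys/kern/kern_iosched.c:87
#4  0xc021744a in vn_open (nd=0xdd4c6c80, fp=0xd61e8d60, fmode=514, cmode=420)
"""

if __name__ == "__main__":
    for addr, info in summarize(SAMPLE).items():
        print(addr, info["wmesg"], "via VN" if via_vn_device(info["frames"]) else "")
```

Run over the full traces, a check like this would single out 0xd8fa2000 (the sched_sync thread), which sleeps in bd_wait() underneath vnstrategy and hammer_vop_write, matching the path guessed in the quoted message. The other flstik threads get there directly from a write or open syscall.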
-- 
Regards,
Rumko




