Random server crashes every few weeks (smp_invltlb: endless loop […] retrysmp_invltlb: ipi sent)

Stefan Unterweger 232.20711 at chiffre.aleturo.com
Mon Sep 12 13:33:47 PDT 2016


Hi!

I hadn't seen your post until now.

I have finally managed to set up a test machine and am currently
throwing as many heavy jobs at it as I can think of.

In general it feels more stable, but I still get crashes.  The crashes
are different, though.  In the last run, the machine just rebooted out
of the blue, leaving no salvageable trace of any kind (it crashed again
during boot, and the dmesg output was lost after a hard reset).  I have
restarted the machine and am running the jobs again; hopefully I can
catch something this time.

As for booting:  I still occasionally get those weird crashes at the
dm_crypt stage, but they look different now.  I have attached the panic
and a trace of the most recent one, with the transcripts below:

Here's the panic:
| […]
|  failed in mpipe_done at /usr/src/sys/kern/kern_mpipe.c:131
| cpuid = 5
| Trace beginning at frame 0xffffffe0a11ff528
| panic() at panic+0x261 0xffffffff8060245d
| panic() at panic+0x261 0xffffffff8060245d
| mpipe_done() at mpipe_done+0x3f 0xffffffff80600f08
| dm_target_crypt_destroy() at dm_target_crypt_destroy+0x8b 0xffffffff92976cdc
| dm_table_destroy() at dm_table_destroy+0xda 0xffffffff827ab374
| dm_dev_destroy() at dm_dev_destroy+0x24 0xffffffff827a7bf6
| Debugger("panic")
| cpuid = 0; lapic->id = 00000000
| instruction pointer = 0x8:0xffffffff805f54ac
| stack pointer       = 0x10:0xffffffe09fef57a0
| frame pointer       = 0x10:0xffffffe09fef57f0
| code segment        = base 0x0, limit 0xfffff, type 0x1b
|                     = DPL 0, pres 1, long 1, def32 0, gran 1
| processor eflags    = interrupt enabled, resume, IOPL = 0
| current process     = idle
| current thread      = pri 28 (CRIT)
| kernel: type 9 trap, code=0
| 
| CPU0 stopping CPUs: 0x0000003e
|  stopped
| Stopped at objcache_get+0x75: addl $0x1,0xd0(%r12)

… and the trace:
| objcache_get() at objcache_get+0x75 0xffffffff805f54ac
| essiv_ivgen() at essiv_ivgen+0x69 0xffffffff82876d6d
| dmtc_crypto_read_start() at dmtc_crypto_read_start+0x18e 0xffffffff828766ac
| dmtc_bio_read_done() at dmtc_bio_read_done+0x2c 0xffffffff828766ff
| biodone() at biodone+0x110 0xffffffff8066790e
| vtblk_vq_intr() at vtblk_vq_intr+0x129 0xffffffff808c9319
| virtqueue_intr() at virtqueue_intr+0x26 0xffffffff808cf05d
| vtpci_legacy_intr() at vtpci_legacy_intr+0x58 0xffffffff808cda47
| lwkt_serialize_handler_call() at lwkt_serialize_handler_call+0x79 0xffffffff8061bce1
| ithread_handler() at ithread_handler+0x242 0xffffffff805d3ad7


    Stefan



* Matthew Dillon on Mon, Sep 05, 2016 at 02:25:12PM -0700:
> A fix for at least one indefinite wait buffer bug has gone into
> master (commit 10c39de26c1356d0).  It has not been put into the release
> branch yet as it needs testing.  It's possible that this is related to the
> bug you reported because the bug can occur when the system is paging
> to/from swap while also reading and writing to the filesystem.  So, e.g.
> any swap activity or paging during a buildworld or something like that.
> 
> -Matt

-- 
▪ Die Internetbleibe.  Stylish, magical, powerful.  https://internetbleibe.de/
▪ medoly media UG (haftungsbeschränkt) | Hausburgstraße 13, 10249 Berlin
▪ info at medolymedia.de | https://medolymedia.de/ | Tel. 030 609 826‒560 | Fax …‒569
▪ Managing director: Matthias Nothhaft | HRB 131198 (Amtsgericht Berlin-Charlottenburg), registered office: Berlin, VAT ID: DE275221203
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 5GIc+bh13 ? ledostar at nur-ab-sal xwd-#00041.png
Type: image/png
Size: 30485 bytes
Desc: not available
URL: <http://lists.dragonflybsd.org/pipermail/users/attachments/20160912/808ae23f/attachment-0010.png>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 5GIc+bh13 ? ledostar at nur-ab-sal xwd-#00042.png
Type: image/png
Size: 31641 bytes
Desc: not available
URL: <http://lists.dragonflybsd.org/pipermail/users/attachments/20160912/808ae23f/attachment-0011.png>

