Re: Random server crashes every few weeks (smp_invltlb: endless loop […] retrysmp_invltlb: ipi sent)

Matthew Dillon dillon at backplane.com
Sun Jul 17 10:21:04 PDT 2016


The smp_invltlb() issue should hopefully be fixed in the latest master but
the code has not been well-tested yet.

The cause of the original issue you reported is not entirely known (it's an
issue with one of the other cpu cores, not the one that generated the
message and backtrace), but the system is usually able to recover from it
on its own.  The new code should work a lot better.
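
To illustrate why the message can show up on a cpu that is not itself at
fault, here is a toy Python model of a loop that waits for other cpus to
acknowledge a request.  It is only a sketch and not the DragonFly kernel
code: the function name wait_for_acks, the ack_schedule argument, and the
spin thresholds are all made up for illustration.

```python
def wait_for_acks(pending_mask, ack_schedule, warn_after=1000, max_spins=10000):
    """Toy model: spin until every cpu bit in pending_mask has been cleared.

    pending_mask  -- bitmask of cpus that still have to acknowledge
    ack_schedule  -- hypothetical dict: cpu id -> spin count at which it acks
    warn_after    -- emit a warning every this many spins while still waiting
    max_spins     -- give up after this many spins (only so the model terminates)

    Returns (spins, warnings, remaining_mask); each warning is a tuple
    (spin_count, mask_of_cpus_still_pending).
    """
    warnings = []
    spins = 0
    while pending_mask:
        spins += 1
        # Simulate the remote cpus: each one clears its own bit when it acks.
        for cpu, when in ack_schedule.items():
            if when == spins:
                pending_mask &= ~(1 << cpu)
        # The *waiting* cpu is the one that notices the delay and reports it.
        if pending_mask and spins % warn_after == 0:
            warnings.append((spins, pending_mask))
        if spins >= max_spins:
            break
    return spins, warnings, pending_mask


if __name__ == "__main__":
    # cpu 1 acks quickly, cpu 2 is slow; the waiter warns but then recovers.
    spins, warnings, remaining = wait_for_acks(0b0110, {1: 3, 2: 2500})
    for when, mask in warnings:
        print(f"still waiting after {when} spins, pending mask {mask:08x}")
    print(f"finished after {spins} spins, remaining mask {remaining:08x}")
```

In this model the warnings are recorded by the waiting side, while the
delay comes entirely from the cpu that is slow to clear its bit.  Once the
slow cpu finally acknowledges, the loop finishes normally, which is the
"self-recover" case.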

-Matt

On Fri, Jul 15, 2016 at 3:54 AM, Stefan Unterweger <232.20711 at chiffre.aleturo.com> wrote:

> * Stefan Unterweger on Thu, Jul 07, 2016 at 09:31:47AM +0200:
> > > >> Virtio (for block storage devices) could be the cause.  There are known
> > > >> bugs in the DragonFly driver for virtio which haven't been tracked down yet
> > > >> (not enough of the devs are using virtual hosting to be able to reproduce
> > > >> the problem in a debuggable way).
> > >
> > > Can you try the latest master?  I think it has been fixed there.
> >
> > So, the patch is finally in.  At least at first glance, the system
> > now feels more stable than before: a big Synth upgrade marathon
> > previously brought the machine down easily, but yesterday it went
> > through.  I'll have to do a few more stress tests, perhaps on a
> > second machine, before I can tick that off.
>
> With the 'gentle' stress tests done so far the system -seems- stable,
> but it looks treacherous.  Today I got another semi-crash under
> moderate load; this is the trace that has just appeared in syslog:
>
> | Jul 15 12:44:01 rhaal kernel: Trace beginning at frame 0xffffffe08025d580
> | Jul 15 12:44:01 rhaal kernel: smp_invltlb() at smp_invltlb+0x229 0xffffffff80a2d3c1
> | Jul 15 12:44:01 rhaal kernel: smp_invltlb() at smp_invltlb+0x229 0xffffffff80a2d3c1
> | Jul 15 12:44:01 rhaal kernel: pmap_qenter() at pmap_qenter+0x6d 0xffffffff80a260f0
> | Jul 15 12:44:01 rhaal kernel: allocbuf() at allocbuf+0x5eb 0xffffffff80669b92
> | Jul 15 12:44:01 rhaal kernel: getblk() at getblk+0x467 0xffffffff8066ccff
> | Jul 15 12:44:01 rhaal kernel: hammer_io_inval() at hammer_io_inval+0x84 0xffffffff80821c6e
> | Jul 15 12:44:01 rhaal kernel: hammer_del_buffers() at hammer_del_buffers+0x1cb 0xffffffff8082e06c
> | Jul 15 12:44:01 rhaal kernel: hammer_io_direct_wait() at hammer_io_direct_wait+0xd2 0xffffffff808231f4
> | Jul 15 12:44:01 rhaal kernel: hammer_ip_sync_record_cursor() at hammer_ip_sync_record_cursor+0xcb 0xffffffff80829556
> | Jul 15 12:44:01 rhaal kernel: hammer_sync_record_callback() at hammer_sync_record_callback+0x24d 0xffffffff8081b45c
> | Jul 15 12:44:01 rhaal kernel: hammer_rec_rb_tree_RB_SCAN() at hammer_rec_rb_tree_RB_SCAN+0xfa 0xffffffff80826844
> | Jul 15 12:44:01 rhaal kernel: hammer_sync_inode() at hammer_sync_inode+0x27e 0xffffffff8081db74
> | Jul 15 12:44:01 rhaal kernel: hammer_flusher_flush_inode() at hammer_flusher_flush_inode+0x55 0xffffffff80819f84
> | Jul 15 12:44:01 rhaal kernel: hammer_fls_rb_tree_RB_SCAN() at hammer_fls_rb_tree_RB_SCAN+0xfc 0xffffffff808190fd
> | Jul 15 12:44:01 rhaal kernel: hammer_flusher_slave_thread() at hammer_flusher_slave_thread+0x7a 0xffffffff80819231
> | Jul 15 12:44:01 rhaal kernel: smp_invltlb: endless loop 00000000 00000002, rflags 0000000000000286 retrysmp_invltlb: ipi sent
>
> The machine thankfully is still running.
>
> The kernel is master edca023af (or at most one or two commits after
> that).  I'll try out today's master on a second machine tomorrow, but
> as long as booting is still flaky, it's difficult to replicate the setup
> and reliably provoke a crash.  The dmesg is attached for reference.
>
>
> Thanks,
>     Stefan
>
>
> --
> Die Internetbleibe.  Stylish, magical, powerful.
> https://internetbleibe.de/
>
> medoly media UG (haftungsbeschränkt), Hausburgstr. 13, 10249 Berlin
>
> E-Mail: info at medolymedia.de
> Phone: 030 - 609 826 560
> Fax 030 - 609 826 569
> Website: https://medolymedia.de/
>
> Managing director: Matthias Nothhaft, HRB 131198 (Amtsgericht
> Berlin-Charlottenburg), registered office: Berlin, VAT ID: DE275221203
>