Re: Random server crashes every few weeks (smp_invltlb: endless loop […] retrysmp_invltlb: ipi sent)

Matthew Dillon dillon at backplane.com
Fri May 27 17:23:42 PDT 2016


Ok, Zachary noted that ivadasz had a patch.  Imre and I went over it on IRC
and he committed the patch to master.  I also committed some additional
changes so it would be great if anyone using master + virtio in a
virtual-hosted environment can [re]test the changes.

There are likely going to be numerous other issues with virtual hosting not
yet addressed.

-Matt
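For anyone retesting, the virtio feature masks in the dmesg quoted further down can be checked by hand. The sketch below is only an illustration, not code from the DragonFly driver: the two constants are copied from the vtblk0 lines of that dmesg, and the helper names (is_subset, dropped_features, print_bits) are made up for this example. It assumes the usual virtio rule that a guest may only accept feature bits the host offered.

```c
#include <stdint.h>
#include <stdio.h>

/* Values copied from the vtblk0 lines of the quoted dmesg. */
#define VTBLK_HOST_FEATURES       UINT32_C(0x710006d4)
#define VTBLK_NEGOTIATED_FEATURES UINT32_C(0x254)

/* Hypothetical helper: nonzero if every negotiated bit was offered by the host. */
static int is_subset(uint32_t negotiated, uint32_t host)
{
    return (negotiated & ~host) == 0;
}

/* Hypothetical helper: bits the host offered that did not end up negotiated. */
static uint32_t dropped_features(uint32_t host, uint32_t negotiated)
{
    return host & ~negotiated;
}

/* Hypothetical helper: print the positions of all set bits in a mask. */
static void print_bits(const char *label, uint32_t mask)
{
    printf("%s 0x%08x: bits", label, (unsigned)mask);
    for (unsigned bit = 0; bit < 32; bit++) {
        if (mask & (UINT32_C(1) << bit))
            printf(" %u", bit);
    }
    printf("\n");
}
```

Calling print_bits("dropped", dropped_features(VTBLK_HOST_FEATURES, VTBLK_NEGOTIATED_FEATURES)) lists the bit positions the host offered for vtblk0 that were not negotiated.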

On Fri, May 27, 2016 at 10:08 AM, Matthew Dillon <dillon at backplane.com>
wrote:

> Virtio (for block storage devices) could be the cause.  There are known
> bugs in the DragonFly driver for virtio which haven't been tracked down yet
> (not enough of the devs are using virtual hosting to be able to reproduce
> the problem in a debuggable way).
>
> -Matt
>
> On Fri, May 27, 2016 at 7:39 AM, Steve Petrie, P.Eng. <
> apetrie at aspetrie.net> wrote:
>
>> Greetings To DragonFlyBSD List,
>>
>> The subject of random server crashes with DragonFly running on a
>> virtualized host machine is of great interest (and concern) to me. Caveat:
>> I am an (almost) complete DragonFly newbie.
>>
>> Please see my comments inline below.
>>
>> Steve
>>
>> ----- Original Message ----- From: "Stefan Unterweger" <
>> 232.20711 at chiffre.aleturo.com>
>> To: "Matthew Dillon" <dillon at backplane.com>
>> Cc: <users at dragonflybsd.org>
>> Sent: Friday, May 27, 2016 3:38 AM
>> Subject: Re: Random server crashes every few weeks (smp_invltlb: endless
>> loop […] retrysmp_invltlb: ipi sent)
>>
>>
>>
>> * Matthew Dillon on Thu, May 26, 2016 at 11:00:18AM -0700:
>>>
>>>> It's really hard to say from something which is virtually hosted.  It
>>>> kinda sounds like the virtual host isn't assigning enough of its own
>>>> cpus to the virtual host.  The fact that DragonFly is complaining about
>>>> smp_invltlb() implies that the host's virtualized cpu threads are not
>>>> getting scheduled properly.
>>>>
>>>> One thing to note is that we do not do any instruction escapes to hint
>>>> to virtual hosts when a cpu is in a tight loop waiting for
>>>> synchronization.  It would be nice if we had some support for that, it
>>>> would probably make DFly play better on virtualized systems.
>>>>
>>>
>>> This is an interesting suggestion, which would explain at least some of
>>> the cases where I’ve experienced the crashes (the daily HAMMER cronjob,
>>> heavy paging under stress, I/O bursts and so on).
>>>
>>> So in effect, could it be that the crashes are more likely when either my
>>> own server comes under load or some -other- server that happens to run on
>>> the same hypervisor does?
>>>
>>> Would this warrant opening a ticket with Profitbricks, or is it just as
>>> likely that I’m wasting my time and will only get a response along the
>>> lines of ‘Use Linux; Dragonfly BSD most certainly is not supported’?
>>>
>>>> I suggest setting the number of cores to 1.  That will get rid of all SMP
>>>> interplay and hopefully remove the issues the virtual host is choking on.
>>>>
>>>
>>> Interestingly enough, I have seen the opposite so far.  At first, I ran
>>> the server on only one core, to save money and because it doesn’t really
>>> need any more yet.  On one core it still freezes, following approximately
>>> the same pattern, but I never got a trace there.
>>>
>>> My guess then was that there might be some odd race condition between
>>> paging, HAMMER and dm_crypt.  Adding another core seemed more stable
>>> temporarily, but then regressed back to the mean.
>>>
>>> I will try to set up another VM to see whether I can reliably reproduce
>>> such a crash.
>>>
>>>
>> After a great deal of research, I chose DragonFly as the OS for a new
>> website (not yet online). Three main attributes drew me to DragonFly: 1.
>> reputation for reliability and speed, 2. the HAMMER file system, 3.
>> responsiveness of the DragonFly open source community.
>>
>> However, my business plan for this new website requires starting out
>> hosting it on a VM under QEMU / KVM virtualization, because I cannot
>> justify the much higher cost of dedicated server hosting hardware. And I
>> like the brutally competitive quasi-commoditized hosting services market
>> for QEMU / KVM virtualization offerings.
>>
>> I do have an experimental working DragonFly installation on a QEMU / KVM
>> VM hosted at Elastic Hosts (www.elastichosts.com). I access this VM through
>> TightVNC and I get a DragonFly console using PuTTY.
>>
>> But I had to suspend work on testing this DragonFly VM installation, due
>> to other business priorities. I hope to get back to it later in 2016.
>>
>> However, I can highly recommend Elastic Hosts for their solid cloud
>> infrastructure and their strong customer support.
>>
>> So if Stefan wants to expand his testing of the DragonFly crash to a VM
>> with a (probably) different underlying architecture than at ProfitBricks, I
>> would recommend giving an Elastic Hosts QEMU / KVM VM a try.
>>
>> Alternatively, if Stefan has (or develops) a simple, self-contained test
>> setup for reproducing the DragonFly crash he is experiencing on the
>> ProfitBricks VM he's presently using, I would be interested in setting up
>> the same test scenario on the Elastic Hosts VM where I presently have (an
>> outdated version of) DragonFly operational, and later on an upgraded
>> version of DragonFly.
>>
>> Steve
>>
>>
>>
>>> Thanks for your answer,
>>>  Stefan
>>>
>>>
>>>
>>> PS: Just in case, as I forgot it previously: here’s the dmesg from
>>>    the server in question.
>>>
>>> | Copyright (c) 2003-2015 The DragonFly Project.
>>> | Copyright (c) 1992-2003 The FreeBSD Project.
>>> | Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>>> | The Regents of the University of California. All rights reserved.
>>> | DragonFly v4.4.1-RELEASE #2: Sun Dec  6 19:10:59 EST 2015
>>> | root at www.shiningsilence.com:/usr/obj/home/justin/release/4_4/sys/X86_64_GENERIC
>>> | TSC clock: 2600054420 Hz, i8254 clock: 1193169 Hz
>>> | CPU: AMD Opteron 62xx class CPU (2600.11-MHz K8-class CPU)
>>> |   Origin = "AuthenticAMD"  Id = 0x600f12  Stepping = 2
>>> |   Features=0x783fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2>
>>> |   Features2=0x96982203<SSE3,PCLMULQDQ,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,AESNI,XSAVE,AVX,VMM>
>>> |   AMD Features=0x24500800<SYSCALL,NX,MMX+,Page1GB,LM>
>>> |   AMD Features2=0x10be7<LAHF,CMP,SVM,ABM,SSE4A,MAS,Prefetch,OSVW,XOP,FMA4>
>>> |   MONITOR/MWAIT Features=0x2<INTBRK>
>>> | real memory  = 3219762176 (3070 MB)
>>> | avail memory = 2990727168 (2852 MB)
>>> | lapic: divisor index 0, frequency 500005713 Hz
>>> | SMI Frequency (worst case): 28571 Hz (35 us)
>>> | Initialize MI interrupts
>>> | wdog: In-kernel automatic watchdog reset enabled
>>> | kbd1 at kbdmux0
>>> | md0: Preloaded image <initrd.img> 15728640 bytes at 0xffffffff82739ac0
>>> | md1: Malloc disk
>>> | ACPI: RSDP 0x00000000000FC980 000014 (v00 BOCHS )
>>> | ACPI: RSDT 0x00000000BFFFBCA0 000040 (v01 BOCHS  BXPCRSDT 00000001 BXPC 00000001)
>>> | ACPI: FACP 0x00000000BFFFFF80 000074 (v01 BOCHS  BXPCFACP 00000001 BXPC 00000001)
>>> | ACPI: DSDT 0x00000000BFFFBCE0 00151D (v01 BXPC   BXDSDT   00000001 INTL 20100528)
>>> | ACPI: FACS 0x00000000BFFFFF40 000040
>>> | ACPI: APIC 0x00000000BFFFFC60 000270 (v01 BOCHS  BXPCAPIC 00000001 BXPC 00000001)
>>> | ACPI: HPET 0x00000000BFFFFC20 000038 (v01 BOCHS  BXPCHPET 00000001 BXPC 00000001)
>>> | ACPI: SRAT 0x00000000BFFFF770 0004A8 (v01 BOCHS  BXPCSRAT 00000001 BXPC 00000001)
>>> | ACPI: SSDT 0x00000000BFFFD8E0 001E8E (v01 BOCHS  BXPCSSDT 00000001 BXPC 00000001)
>>> | ACPI: SSDT 0x00000000BFFFD870 00003D (v01 BOCHS  BXPCSSDT 00000001 BXPC 00000001)
>>> | ACPI: SSDT 0x00000000BFFFD200 00066E (v01 BXPC   BXSSDTPC 00000001 INTL 20100528)
>>> | cryptosoft0: <software crypto> on motherboard
>>> | aesni0: <AES-CBC,AES-XTS> on motherboard
>>> | padlock0: No ACE support.
>>> | rdrand0: No RdRand support.
>>> | acpi0: <BOCHS BXPCRSDT> on motherboard
>>> | ACPI: 4 ACPI AML tables successfully acquired and loaded
>>> | ACPI FADT: SCI testing interrupt mode ...
>>> | ACPI FADT: SCI select level/low
>>> | objcache_reclaimlist
>>> | objcache_reclaimlist
>>> | objcache_reclaimlist
>>> | objcache_reclaimlist
>>> | acpi0: Power Button (fixed)
>>> | acpi_timer0 on acpi0
>>> | acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
>>> | acpi_hpet0: frequency 100000000
>>> | pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
>>> | pci0: <ACPI PCI bus> on pcib0
>>> | pci_link4: Unable to route IRQs: AE_NOT_FOUND
>>> | isab0: <PCI-ISA bridge> at device 1.0 on pci0
>>> | isa0: <ISA bus> on isab0
>>> | atapci0: <Intel PIIX3 WDMA2 controller> port 0xc120-0xc12f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 1.1 on pci0
>>> | ata0: <ATA channel 0> on atapci0
>>> | ata1: <ATA channel 1> on atapci0
>>> | acd0: DVDROM <QEMU DVD-ROM/1.0> at ata1-master WDMA2
>>> | uhci0: <Intel 82371SB (PIIX3) USB controller> port 0xc0c0-0xc0df irq 11 at device 1.2 on pci0
>>> | usbus0: controller did not stop
>>> | usbus0 on uhci0
>>> | pci0: <bridge> (vendor 0x8086, dev 0x7113) at device 1.3 irq 9
>>> | vgapci0: <VGA-compatible display> mem 0xfd000000-0xfdffffff at device 2.0 on pci0
>>> | vgapci0: Boot video device
>>> | virtio_pci0: <VirtIO PCI Balloon adapter> port 0xc0e0-0xc0ff irq 11 at device 3.0 on pci0
>>> | virtio_pci1: <VirtIO PCI Block adapter> port 0xc000-0xc03f mem 0xfebf0000-0xfebf0fff irq 10 at device 5.0 on pci0
>>> | vtblk0: <VirtIO Block Adapter> on virtio_pci1
>>> | virtio_pci1: host features: 0x710006d4 <EventIdx,RingIndirect,NotifyOnEmpty,Topology,WriteCache,SCSICmds,BlockSize,DiskGeometry,MaxNumSegs>
>>> | virtio_pci1: negotiated features: 0x254 <WriteCache,BlockSize,DiskGeometry,MaxNumSegs>
>>> | virtio_pci2: <VirtIO PCI Block adapter> port 0xc040-0xc07f mem 0xfebf1000-0xfebf1fff irq 10 at device 6.0 on pci0
>>> | vtblk1: <VirtIO Block Adapter> on virtio_pci2
>>> | virtio_pci2: host features: 0x710006d4 <EventIdx,RingIndirect,NotifyOnEmpty,Topology,WriteCache,SCSICmds,BlockSize,DiskGeometry,MaxNumSegs>
>>> | virtio_pci2: negotiated features: 0x254 <WriteCache,BlockSize,DiskGeometry,MaxNumSegs>
>>> | virtio_pci3: <VirtIO PCI Network adapter> port 0xc100-0xc11f mem 0xfebf2000-0xfebf2fff irq 11 at device 7.0 on pci0
>>> | vtnet0: <VirtIO Networking Adapter> on virtio_pci3
>>> | virtio_pci3: host features: 0x711f8060 <EventIdx,RingIndirect,NotifyOnEmpty,RxModeExtra,VLanFilter,RxMode,ControlVq,Status,MrgRxBuf,TxAllGSO,MacAddress>
>>> | virtio_pci3: negotiated features: 0x110f8020 <RingIndirect,NotifyOnEmpty,VLanFilter,RxMode,ControlVq,Status,MrgRxBuf,MacAddress>
>>> | usbus0: 12Mbps Full Speed USB v1.0
>>> | vtnet0: MAC address: 02:01:06:f6:1b:63
>>> | add dynamic link state
>>> | virtio_pci4: <VirtIO PCI Block adapter> port 0xc080-0xc0bf mem 0xfebf3000-0xfebf3fff irq 11 at device 8.0 on pci0
>>> | vtblk2: <VirtIO Block Adapter> on virtio_pci4
>>> | virtio_pci4: host features: 0x710006d4 <EventIdx,RingIndirect,NotifyOnEmpty,Topology,WriteCache,SCSICmds,BlockSize,DiskGeometry,MaxNumSegs>
>>> | virtio_pci4: negotiated features: 0x254 <WriteCache,BlockSize,DiskGeometry,MaxNumSegs>
>>> | atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
>>> | atkbd0: <AT Keyboard> irq 1 on atkbdc0
>>> | kbd0 at atkbd0
>>> | ugen0.1: <Intel> at usbus0
>>> | uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
>>> | psm0: <PS/2 Mouse> irq 12 on atkbdc0
>>> | psm0: model IntelliMouse Explorer, device ID 4
>>> | cpu0: <ACPI CPU> on acpi0
>>> | cpu_cst0: <ACPI CPU C-State> on cpu0
>>> | cpu1: <ACPI CPU> on acpi0
>>> | cpu_cst1: <ACPI CPU C-State> on cpu1
>>> | ACPI: Enabled 16 GPEs in block 00 to 0F
>>> | orm0: <ISA Option ROM> at iomem 0xe9800-0xeffff on isa0
>>> | pmtimer0 on isa0
>>> | vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
>>> | sc0: <System console> at flags 0x100 on isa0
>>> | sc0: VGA <16 virtual consoles, flags=0x300>
>>> | sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
>>> | sio0: type 16550A
>>> | sio1: can't drain, serial port might not exist, disabling
>>> | hpt27xx: no controller detected.
>>> | CAM: Configuring 2 busses
>>> | CAM: finished configuring all busses
>>> | cd0 at ata1 bus 0 target 0 lun 0
>>> | cd0: <QEMU QEMU DVD-ROM 1.0> Removable CD-ROM SCSI-0 device
>>> | cd0: 16.000MB/s transfers
>>> | cd0: cd present [329728 x 2048 byte records]
>>> | uhub0: 2 ports with 2 removable, self powered
>>> | ugen0.2: <QEMU> at usbus0
>>> | uhid0: <QEMU QEMU USB Tablet, class 0/0, rev 1.00/0.00, addr 2> on usbus0
>>> | no B_DEVMAGIC (bootdev=0)
>>> | Device Mapper version 4.16.0 loaded
>>> | dm_target_zero: Successfully initialized
>>> | dm_target_crypt: Successfully initialized
>>> | dm_target_error: Successfully initialized
>>> | Mounting root from ufs:md0s0
>>> | DMA space used: 1236k, remaining available: 131072k
>>> | Mounting devfs
>>> | dm_target_crypt: Setting min/max mpipe buffers: 2/30
>>> | dm_target_crypt: Setting min/max mpipe buffers: 2/30
>>> | HAMMER(Rhaal) recovery check seqno=055a4f51
>>> | HAMMER(Rhaal) recovery range 300000000cc2da60-300000000cc2da60
>>> | HAMMER(Rhaal) recovery nexto 300000000cc2da60 endseqno=055a4f52
>>> | HAMMER(Rhaal) mounted clean, no recovery needed
>>> | chroot_kernel: set new rootnch/rootvnode to /new_root
>>> | dm_target_crypt: Setting min/max mpipe buffers: 2/30
>>> | dm_target_crypt: Setting min/max mpipe buffers: 2/30
>>> | dm_target_crypt: Setting min/max mpipe buffers: 2/30
>>> | HAMMER: read-only -> read-write
>>> | HAMMER(Rhaal-Daten) recovery check seqno=352f5bc4
>>> | HAMMER(Rhaal-Daten) recovery range 3000000000d5c108-3000000000d77bc8
>>> | HAMMER(Rhaal-Daten) recovery nexto 3000000000d77bc8 endseqno=352f5cc1
>>> | HAMMER(Rhaal-Daten) recovery undo  3000000000d5c108-3000000000d77bc8 (113344 bytes)(RW)
>>> | HAMMER(Rhaal-Daten) Found REDO_SYNC 3000000000cb87a0
>>> | HAMMER(Rhaal-Daten) recovery complete
>>> | HAMMER(Rhaal-Daten) recovery redo  3000000000d5c108-3000000000d77bc8 (113344 bytes)(RW)
>>> | HAMMER(Rhaal-Daten) Find extended redo  3000000000cb87a0, 670056 extbytes
>>> | HAMMER(Rhaal-Daten) End redo recovery
>>> | dm_target_crypt: Setting min/max mpipe buffers: 2/30
>>> | swap low/high-water marks set to 83874/125811
>>>
>>
>>
>
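
As a footnote to the remark quoted above about hinting to the virtual host when a cpu sits in a tight synchronization loop: on x86 one common form of such a hint is the PAUSE instruction, which hypervisors on hardware with pause-loop exiting (Intel) or pause filtering (AMD) can use to notice a spinning virtual cpu. The sketch below is a generic userland illustration of that idea, not DragonFly kernel code. The names cpu_relax_hint and spin_wait are hypothetical, and whether this would have helped with the smp_invltlb() case discussed here is an assumption, not something established in this thread.

```c
#include <stdatomic.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>                  /* _mm_pause() emits the PAUSE instruction */
#define cpu_relax_hint() _mm_pause()
#else
#define cpu_relax_hint() ((void)0)      /* no hint on other architectures in this sketch */
#endif

/*
 * Hypothetical helper: spin until *flag becomes nonzero or max_spins
 * iterations have passed.  Returns the number of iterations spent
 * waiting, or -1 if the flag was never observed set.
 */
static long spin_wait(atomic_int *flag, long max_spins)
{
    for (long i = 0; i < max_spins; i++) {
        if (atomic_load_explicit(flag, memory_order_acquire))
            return i;
        cpu_relax_hint();               /* tell the cpu (and, indirectly, the host) we are spinning */
    }
    return -1;
}
```

The bounded loop is deliberate: an unbounded spin with no hint is the pattern that tends to behave badly when the host deschedules the virtual cpu that is supposed to set the flag.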