Re: Random server crashes every few weeks (smp_invltlb: endless loop […] retrysmp_invltlb: ipi sent)

Steve Petrie, P.Eng. apetrie at aspetrie.net
Fri May 27 07:39:31 PDT 2016


Greetings To DragonFlyBSD List,

The subject of random server crashes with DragonFly running on a
virtualized host machine is of great interest (and concern) to me.
Caveat: I am an (almost) complete DragonFly newbie.

Please see my comments inline below.

Steve

----- Original Message ----- 
From: "Stefan Unterweger" <232.20711 at chiffre.aleturo.com>
To: "Matthew Dillon" <dillon at backplane.com>
Cc: <users at dragonflybsd.org>
Sent: Friday, May 27, 2016 3:38 AM
Subject: Re: Random server crashes every few weeks (smp_invltlb: endless 
loop […] retrysmp_invltlb: ipi sent)


>* Matthew Dillon on Thu, May 26, 2016 at 11:00:18AM -0700:
>> It's really hard to say from something which is virtually hosted.  It
>> kinda sounds like the virtual host isn't assigning enough of its own
>> cpus to the virtual host.  The fact that DragonFly is complaining about
>> smp_invltlb() implies that the host's virtualized cpu threads are not
>> getting scheduled properly.
>>
>> One thing to note is that we do not do any instruction escapes to hint
>> to virtual hosts when a cpu is in a tight loop waiting for
>> synchronization.  It would be nice if we had some support for that, it
>> would probably make DFly play better on virtualized systems.
>
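
For readers unfamiliar with the "instruction escape" Matthew mentions:
on x86, one common hint of this kind is the PAUSE instruction, issued
inside a spin-wait loop. Some hypervisors can be configured to intercept
repeated PAUSEs (Intel calls this PAUSE-loop exiting, AMD calls it the
pause filter) and use that as a cue to run another virtual cpu. Below is
a minimal user-space sketch in C of such a loop. It is not DragonFly
kernel code; the names cpu_relax_hint() and spin_wait_flag() and the
spin limit are hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
/* On x86, emit PAUSE: a hint that this cpu is busy-waiting. */
static inline void cpu_relax_hint(void) { _mm_pause(); }
#else
/* Assumption: on other architectures this sketch emits no hint. */
static inline void cpu_relax_hint(void) { }
#endif

/*
 * Hypothetical helper: spin until *flag becomes nonzero, or give up
 * after max_spins iterations. Returns true if the flag was seen set.
 */
static bool spin_wait_flag(atomic_int *flag, uint64_t max_spins)
{
    assert(flag != NULL);

    for (uint64_t i = 0; i < max_spins; i++) {
        if (atomic_load_explicit(flag, memory_order_acquire) != 0)
            return true;
        cpu_relax_hint();   /* tell the cpu (and possibly the host) we are spinning */
    }
    /* One last check after the spin budget is exhausted. */
    return atomic_load_explicit(flag, memory_order_acquire) != 0;
}
```

A caller would set the flag from another thread and treat a false return
as "gave up waiting" rather than spinning forever. Whether a given
hypervisor actually reacts to the hint depends on how the host is
configured.
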
> This is an interesting suggestion, which would explain at least some
> of the cases where I’ve experienced the crashes (the daily HAMMER
> cronjob, heavy paging under stress, I/O bursts and so on).
>
> So in effect, could it be that the crashes are more likely as either my
> own server comes under load or some -other- server who happens to run
> in the same hypervisor?
>
> Would this warrant opening a ticket with Profitbricks, or is it just as
> likely that I’m wasting my time and will only get a response along the
> lines of ‘Use Linux; Dragonfly BSD most certainly is not supported’?
>
>> I suggest setting the number of cores to 1.  That will get rid of all
>> SMP interplay and hopefully remove the issues the virtual host is
>> choking on.
>
> Interestingly enough, I have seen the opposite so far.  At first, I
> have run the server on only one core, to save money and because it
> doesn’t really yet need any more.  When on one core, it still freezes,
> along approximately the same pattern, but I never got a trace there.
>
> My guess then was that perhaps there would have been some odd race
> condition between paging, HAMMER and dm_crypt—adding another core
> temporarily seemed more stable and then regressed back to the mean.
>
> I will try to set up another VM to see whether I can reliably
> reproduce such a crash.
>

After a great deal of research, I chose DragonFly as the OS for a new
website (not yet online). Three main attributes drew me to DragonFly: 1.
its reputation for reliability and speed, 2. the HAMMER file system, 3.
the responsiveness of the DragonFly open source community.

However, my business plan for this new website requires starting out by
hosting it on a VM under QEMU / KVM virtualization, because I cannot
justify the much higher cost of dedicated server hardware. I also like
the brutally competitive, quasi-commoditized hosting market for QEMU /
KVM virtualization offerings.

I do have an experimental working DragonFly installation on a QEMU / KVM
VM hosted at Elastic Hosts (www.elastichosts.com). I access this VM
through TightVNC, and I get a DragonFly console using PuTTY.

But I had to suspend work on testing this DragonFly VM installation, due 
to other business priorities. I hope to get back to it later in 2016.

However, I can highly recommend Elastic Hosts for their solid cloud 
infrastructure and their strong customer support.

So if Stefan wants to expand his testing of the DragonFly crash to a VM 
with a (probably) different underlying architecture than at 
ProfitBricks, I would recommend giving an Elastic Hosts QEMU / KVM VM a 
try.

Alternatively, if Stefan has (or develops) a simple, self-contained test
setup that reproduces the crash he is seeing on his ProfitBricks VM, I
would be interested in running the same test on my Elastic Hosts VM:
first on the (outdated) DragonFly version currently installed there, and
later on an upgraded version.

Steve

>
> Thanks for your answer,
>  Stefan
>
>
>
> PS: Just in case, as I’ve forgotten it previously: here’s the dmesg
>    from the server in question.
>
> | Copyright (c) 2003-2015 The DragonFly Project.
> | Copyright (c) 1992-2003 The FreeBSD Project.
> | Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> | The Regents of the University of California. All rights reserved.
> | DragonFly v4.4.1-RELEASE #2: Sun Dec  6 19:10:59 EST 2015
> |     root at www.shiningsilence.com:/usr/obj/home/justin/release/4_4/sys/X86_64_GENERIC
> | TSC clock: 2600054420 Hz, i8254 clock: 1193169 Hz
> | CPU: AMD Opteron 62xx class CPU (2600.11-MHz K8-class CPU)
> |   Origin = "AuthenticAMD"  Id = 0x600f12  Stepping = 2
> |   Features=0x783fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE,SSE2>
> |   Features2=0x96982203<SSE3,PCLMULQDQ,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,AESNI,XSAVE,AVX,VMM>
> |   AMD Features=0x24500800<SYSCALL,NX,MMX+,Page1GB,LM>
> |   AMD Features2=0x10be7<LAHF,CMP,SVM,ABM,SSE4A,MAS,Prefetch,OSVW,XOP,FMA4>
> |   MONITOR/MWAIT Features=0x2<INTBRK>
> | real memory  = 3219762176 (3070 MB)
> | avail memory = 2990727168 (2852 MB)
> | lapic: divisor index 0, frequency 500005713 Hz
> | SMI Frequency (worst case): 28571 Hz (35 us)
> | Initialize MI interrupts
> | wdog: In-kernel automatic watchdog reset enabled
> | kbd1 at kbdmux0
> | md0: Preloaded image <initrd.img> 15728640 bytes at 0xffffffff82739ac0
> | md1: Malloc disk
> | ACPI: RSDP 0x00000000000FC980 000014 (v00 BOCHS )
> | ACPI: RSDT 0x00000000BFFFBCA0 000040 (v01 BOCHS  BXPCRSDT 00000001 BXPC 00000001)
> | ACPI: FACP 0x00000000BFFFFF80 000074 (v01 BOCHS  BXPCFACP 00000001 BXPC 00000001)
> | ACPI: DSDT 0x00000000BFFFBCE0 00151D (v01 BXPC   BXDSDT   00000001 INTL 20100528)
> | ACPI: FACS 0x00000000BFFFFF40 000040
> | ACPI: APIC 0x00000000BFFFFC60 000270 (v01 BOCHS  BXPCAPIC 00000001 BXPC 00000001)
> | ACPI: HPET 0x00000000BFFFFC20 000038 (v01 BOCHS  BXPCHPET 00000001 BXPC 00000001)
> | ACPI: SRAT 0x00000000BFFFF770 0004A8 (v01 BOCHS  BXPCSRAT 00000001 BXPC 00000001)
> | ACPI: SSDT 0x00000000BFFFD8E0 001E8E (v01 BOCHS  BXPCSSDT 00000001 BXPC 00000001)
> | ACPI: SSDT 0x00000000BFFFD870 00003D (v01 BOCHS  BXPCSSDT 00000001 BXPC 00000001)
> | ACPI: SSDT 0x00000000BFFFD200 00066E (v01 BXPC   BXSSDTPC 00000001 INTL 20100528)
> | cryptosoft0: <software crypto> on motherboard
> | aesni0: <AES-CBC,AES-XTS> on motherboard
> | padlock0: No ACE support.
> | rdrand0: No RdRand support.
> | acpi0: <BOCHS BXPCRSDT> on motherboard
> | ACPI: 4 ACPI AML tables successfully acquired and loaded
> | ACPI FADT: SCI testing interrupt mode ...
> | ACPI FADT: SCI select level/low
> | objcache_reclaimlist
> | objcache_reclaimlist
> | objcache_reclaimlist
> | objcache_reclaimlist
> | acpi0: Power Button (fixed)
> | acpi_timer0 on acpi0
> | acpi_hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff on acpi0
> | acpi_hpet0: frequency 100000000
> | pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
> | pci0: <ACPI PCI bus> on pcib0
> | pci_link4: Unable to route IRQs: AE_NOT_FOUND
> | isab0: <PCI-ISA bridge> at device 1.0 on pci0
> | isa0: <ISA bus> on isab0
> | atapci0: <Intel PIIX3 WDMA2 controller> port 0xc120-0xc12f,0x376,0x170-0x177,0x3f6,0x1f0-0x1f7 at device 1.1 on pci0
> | ata0: <ATA channel 0> on atapci0
> | ata1: <ATA channel 1> on atapci0
> | acd0: DVDROM <QEMU DVD-ROM/1.0> at ata1-master WDMA2
> | uhci0: <Intel 82371SB (PIIX3) USB controller> port 0xc0c0-0xc0df irq 11 at device 1.2 on pci0
> | usbus0: controller did not stop
> | usbus0 on uhci0
> | pci0: <bridge> (vendor 0x8086, dev 0x7113) at device 1.3 irq 9
> | vgapci0: <VGA-compatible display> mem 0xfd000000-0xfdffffff at device 2.0 on pci0
> | vgapci0: Boot video device
> | virtio_pci0: <VirtIO PCI Balloon adapter> port 0xc0e0-0xc0ff irq 11 at device 3.0 on pci0
> | virtio_pci1: <VirtIO PCI Block adapter> port 0xc000-0xc03f mem 0xfebf0000-0xfebf0fff irq 10 at device 5.0 on pci0
> | vtblk0: <VirtIO Block Adapter> on virtio_pci1
> | virtio_pci1: host features: 0x710006d4 <EventIdx,RingIndirect,NotifyOnEmpty,Topology,WriteCache,SCSICmds,BlockSize,DiskGeometry,MaxNumSegs>
> | virtio_pci1: negotiated features: 0x254 <WriteCache,BlockSize,DiskGeometry,MaxNumSegs>
> | virtio_pci2: <VirtIO PCI Block adapter> port 0xc040-0xc07f mem 0xfebf1000-0xfebf1fff irq 10 at device 6.0 on pci0
> | vtblk1: <VirtIO Block Adapter> on virtio_pci2
> | virtio_pci2: host features: 0x710006d4 <EventIdx,RingIndirect,NotifyOnEmpty,Topology,WriteCache,SCSICmds,BlockSize,DiskGeometry,MaxNumSegs>
> | virtio_pci2: negotiated features: 0x254 <WriteCache,BlockSize,DiskGeometry,MaxNumSegs>
> | virtio_pci3: <VirtIO PCI Network adapter> port 0xc100-0xc11f mem 0xfebf2000-0xfebf2fff irq 11 at device 7.0 on pci0
> | vtnet0: <VirtIO Networking Adapter> on virtio_pci3
> | virtio_pci3: host features: 0x711f8060 <EventIdx,RingIndirect,NotifyOnEmpty,RxModeExtra,VLanFilter,RxMode,ControlVq,Status,MrgRxBuf,TxAllGSO,MacAddress>
> | virtio_pci3: negotiated features: 0x110f8020 <RingIndirect,NotifyOnEmpty,VLanFilter,RxMode,ControlVq,Status,MrgRxBuf,MacAddress>
> | usbus0: 12Mbps Full Speed USB v1.0
> | vtnet0: MAC address: 02:01:06:f6:1b:63
> | add dynamic link state
> | virtio_pci4: <VirtIO PCI Block adapter> port 0xc080-0xc0bf mem 0xfebf3000-0xfebf3fff irq 11 at device 8.0 on pci0
> | vtblk2: <VirtIO Block Adapter> on virtio_pci4
> | virtio_pci4: host features: 0x710006d4 <EventIdx,RingIndirect,NotifyOnEmpty,Topology,WriteCache,SCSICmds,BlockSize,DiskGeometry,MaxNumSegs>
> | virtio_pci4: negotiated features: 0x254 <WriteCache,BlockSize,DiskGeometry,MaxNumSegs>
> | atkbdc0: <Keyboard controller (i8042)> port 0x64,0x60 irq 1 on acpi0
> | atkbd0: <AT Keyboard> irq 1 on atkbdc0
> | kbd0 at atkbd0
> | ugen0.1: <Intel> at usbus0
> | uhub0: <Intel UHCI root HUB, class 9/0, rev 1.00/1.00, addr 1> on usbus0
> | psm0: <PS/2 Mouse> irq 12 on atkbdc0
> | psm0: model IntelliMouse Explorer, device ID 4
> | cpu0: <ACPI CPU> on acpi0
> | cpu_cst0: <ACPI CPU C-State> on cpu0
> | cpu1: <ACPI CPU> on acpi0
> | cpu_cst1: <ACPI CPU C-State> on cpu1
> | ACPI: Enabled 16 GPEs in block 00 to 0F
> | orm0: <ISA Option ROM> at iomem 0xe9800-0xeffff on isa0
> | pmtimer0 on isa0
> | vga0: <Generic ISA VGA> at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0
> | sc0: <System console> at flags 0x100 on isa0
> | sc0: VGA <16 virtual consoles, flags=0x300>
> | sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
> | sio0: type 16550A
> | sio1: can't drain, serial port might not exist, disabling
> | hpt27xx: no controller detected.
> | CAM: Configuring 2 busses
> | CAM: finished configuring all busses
> | cd0 at ata1 bus 0 target 0 lun 0
> | cd0: <QEMU QEMU DVD-ROM 1.0> Removable CD-ROM SCSI-0 device
> | cd0: 16.000MB/s transfers
> | cd0: cd present [329728 x 2048 byte records]
> | uhub0: 2 ports with 2 removable, self powered
> | ugen0.2: <QEMU> at usbus0
> | uhid0: <QEMU QEMU USB Tablet, class 0/0, rev 1.00/0.00, addr 2> on usbus0
> | no B_DEVMAGIC (bootdev=0)
> | Device Mapper version 4.16.0 loaded
> | dm_target_zero: Successfully initialized
> | dm_target_crypt: Successfully initialized
> | dm_target_error: Successfully initialized
> | Mounting root from ufs:md0s0
> | DMA space used: 1236k, remaining available: 131072k
> | Mounting devfs
> | dm_target_crypt: Setting min/max mpipe buffers: 2/30
> | dm_target_crypt: Setting min/max mpipe buffers: 2/30
> | HAMMER(Rhaal) recovery check seqno=055a4f51
> | HAMMER(Rhaal) recovery range 300000000cc2da60-300000000cc2da60
> | HAMMER(Rhaal) recovery nexto 300000000cc2da60 endseqno=055a4f52
> | HAMMER(Rhaal) mounted clean, no recovery needed
> | chroot_kernel: set new rootnch/rootvnode to /new_root
> | dm_target_crypt: Setting min/max mpipe buffers: 2/30
> | dm_target_crypt: Setting min/max mpipe buffers: 2/30
> | dm_target_crypt: Setting min/max mpipe buffers: 2/30
> | HAMMER: read-only -> read-write
> | HAMMER(Rhaal-Daten) recovery check seqno=352f5bc4
> | HAMMER(Rhaal-Daten) recovery range 3000000000d5c108-3000000000d77bc8
> | HAMMER(Rhaal-Daten) recovery nexto 3000000000d77bc8 endseqno=352f5cc1
> | HAMMER(Rhaal-Daten) recovery undo  3000000000d5c108-3000000000d77bc8 (113344 bytes)(RW)
> | HAMMER(Rhaal-Daten) Found REDO_SYNC 3000000000cb87a0
> | HAMMER(Rhaal-Daten) recovery complete
> | HAMMER(Rhaal-Daten) recovery redo  3000000000d5c108-3000000000d77bc8 (113344 bytes)(RW)
> | HAMMER(Rhaal-Daten) Find extended redo  3000000000cb87a0, 670056 extbytes
> | HAMMER(Rhaal-Daten) End redo recovery
> | dm_target_crypt: Setting min/max mpipe buffers: 2/30
> | swap low/high-water marks set to 83874/125811 



