Sun Jun 27 17:59:49 PDT 2021

Issue #3286 has been updated by mwiencek.

dillon wrote:
> Yah, you definitely found an uninitialized return value.  If the garbage in the field happened to be 0, it would try to insert an uninitialized knote, causing the panic.
> Now the question is what should the proper error code be... not -1 :-).  I am guessing EINVAL.  Try changing your assignments from -1 to EINVAL and tell me if sound still works for you.
> -Matt

Confirmed that sound still works using EINVAL instead of -1.  Thanks for the hint...I'm a newbie at this. :)

Bug #3286: page fault in dsp_kqfilter / knote_insert using snd_hda

* Author: mwiencek
* Status: New
* Priority: High
* Assignee: dillon
* Category: 
* Target version: master

I installed DragonFly on a motherboard from 2018 (GIGABYTE B450 AORUS PRO WIFI) and tested the sound to see if it worked.  (Two HDA devices are detected, but one is from the GPU and I was only testing the output of the integrated one, hdac1.)

@pciconf -lv@, truncated
hdac0 at pci0:6:0:1:	class=0x040300 card=0xaaf01849 chip=0xaaf01002 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD/ATI]'
    device     = 'Ellesmere HDMI Audio [Radeon RX 470/480 / 570/580/590]'
    class      = multimedia
    subclass   = HDA
hdac1 at pci0:8:0:3:	class=0x040300 card=0xa0c31458 chip=0x14571022 rev=0x00 hdr=0x00
    vendor     = 'Advanced Micro Devices, Inc. [AMD]'
    device     = 'Family 17h (Models 00h-0fh) HD Audio Controller'
    class      = multimedia
    subclass   = HDA

@kldload snd_hda@
hdac0: <ATI (0xaaf0) HDA Controller> mem 0xfcf60000-0xfcf63fff irq 55 at device 0.1 on pci6
hdac0: link ctrl 0x2930
hdac0: disable nosnoop
hdac1: <AMD (0x1457) HDA Controller> mem 0xfce00000-0xfce07fff irq 43 at device 0.3 on pci8
hdac1: link ctrl 0x2830
hdac1: disable nosnoop
hdacc0: <ATI R6xx HDA CODEC> at cad 0 on hdac0
hdaa0: <ATI R6xx Audio Function Group> at nid 1 on hdacc0
pcm0: <ATI R6xx (HDMI)> at nid 3 on hdaa0
pcm1: <ATI R6xx (HDMI)> at nid 5 on hdaa0
pcm2: <ATI R6xx (HDMI)> at nid 7 on hdaa0
pcm3: <ATI R6xx (HDMI)> at nid 9 on hdaa0
pcm4: <ATI R6xx (HDMI)> at nid 11 on hdaa0
pcm5: <ATI R6xx (HDMI)> at nid 13 on hdaa0
hdacc1: <Realtek (0x1220) HDA CODEC> at cad 0 on hdac1
hdaa1: <Realtek (0x1220) Audio Function Group> at nid 1 on hdacc1
pcm6: <Realtek (0x1220) (Rear Analog 5.1/2.0)> at nid 20,22,21 and 24,26 on hdaa1
pcm7: <Realtek (0x1220) (Front Analog)> at nid 27 and 25 on hdaa1
pcm8: <Realtek (0x1220) (Rear Digital)> at nid 30 on hdaa1

Playing a random mp3 file via mp3blaster works out of the box, but a page fault consistently occurs if I load youtube.com in Firefox and play any video.  The system locks up as soon as the video has loaded and sound is about to play, though luckily a core dump is saved after a hard reboot.

Fatal user address access from kernel mode from firefox at ffffffff80453bcd

Fatal trap 12: page fault while in kernel mode
cpuid = 10; lapic id = 11
fault virtual address	= 0x98
fault code		= supervisor read data, page not present
instruction pointer	= 0x8:0xffffffff80453bcd
stack pointer	        = 0x10:0xfffff806ae85ee08
frame pointer	        = 0x10:0xfffff806ae85ee18
code segment		= base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags	= interrupt enabled, resume, IOPL = 0
current process		= 915
current thread          = pri 6 
trap number		= 12
panic: page fault
cpuid = 10
Trace beginning at frame 0xfffff806ae85eb58
panic() at panic+0x2ea 0xffffffff8047ff83 
panic() at panic+0x2ea 0xffffffff8047ff83 
trap_fatal() at trap_fatal+0x529 0xffffffff808e6c33 
trap_pfault() at trap_pfault+0x1d5 0xffffffff808e63ea 
trap() at trap+0x6af 0xffffffff808e5d49 
calltrap() at calltrap+0x9 0xffffffff808a557a 
--- trap 000000000000000c, rip = ffffffff80453bcd, rsp = fffff806ae85edf0, rbp = fffff806ae85ee18 ---
knote_insert() at knote_insert+0x63 0xffffffff80453bcd

I've attached a patch that allowed me to work around the issue.  Sound even plays from the videos once it's applied, though I'm not sure what the underlying issue is.  @dsp_kqfilter@ is receiving @EVFILT_READ@ events, but for some reason @rdch@ is NULL while @wrch@ is not.  That leaves @bs@ uninitialized, so garbage is assigned to @klist@ (always 0x98).  My kgdb session is attached too.  If there's any other debug info I can provide, let me know -- I tried adding kprintfs to understand what was going on, but got stuck.
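
For anyone following along, here is a minimal userland sketch of the failure pattern described above and of the fix direction discussed in this issue (report EINVAL when no channel matches the requested filter).  It is not the real @dsp_kqfilter@ code: every type and name in it (@attach_filter@, @filter_args@, @sound_buf@, and so on) is a hypothetical stand-in, and the "insert" is reduced to bumping a counter.  It only shows the idea that the buffer pointer starts from a known value and the result field is set on every path, so the list is never touched through garbage.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real kernel types. */
struct note_list { int count; };
struct sound_buf { struct note_list notes; };
struct channel { struct sound_buf *bufsoft; };

enum filter_kind { FILTER_READ, FILTER_WRITE };

struct filter_args {
	enum filter_kind kind;
	int result;		/* 0 on success, otherwise an errno value */
};

/*
 * Pick the buffer that matches the requested filter and "insert" a note
 * into its list.  The result field is written on every path, and the
 * list is only touched when a buffer was actually selected.
 */
static void
attach_filter(struct filter_args *ap, struct channel *rdch,
    struct channel *wrch)
{
	struct sound_buf *bs = NULL;	/* known value, never stack garbage */

	switch (ap->kind) {
	case FILTER_READ:
		if (rdch != NULL)
			bs = rdch->bufsoft;
		break;
	case FILTER_WRITE:
		if (wrch != NULL)
			bs = wrch->bufsoft;
		break;
	}

	if (bs == NULL) {
		/* No usable channel for this filter: report it, insert nothing. */
		ap->result = EINVAL;
		return;
	}

	bs->notes.count++;	/* stands in for inserting into the list */
	ap->result = 0;
}
```

With a read filter, a NULL read channel and a valid write channel (the situation in this report), the sketch sets the result to EINVAL and leaves the write buffer's list alone; the same call with a write filter succeeds.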

dsp.c.patch (579 Bytes)
kgdb-session (17 KB)

