Fatal trap 19: non-maskable interrupt trap while in kernel mode
Hidetoshi Shimokawa
simokawa at FreeBSD.org
Sun Nov 7 18:04:11 PST 2004
At Sun, 7 Nov 2004 12:21:59 -0800 (PST),
Matthew Dillon wrote:
>
>
> :During boot, interrupts are disabled and this shouldn't be a problem.
> :And the interrupt vector is already initialized in fwohci_pci_attach()
> :before fwohci_init() is called.
> :It would be a good idea to mask IT/IR interrupts before probing channels,
> :but that should have nothing to do with this problem.
> :
> :As interrupts are disabled during boot, it cannot be an ordinary interrupt;
> :it must be an NMI. I think it's a PCI bus problem rather than RAM.
> :
> :Try the following patch,
> :
> :Index: fwohci_pci.c
> :===================================================================
> :RCS file: /home/dcvs/src/sys/bus/firewire/fwohci_pci.c,v
> :retrieving revision 1.15
> :diff -u -r1.15 fwohci_pci.c
> :--- fwohci_pci.c 18 Jul 2004 12:37:03 -0000 1.15
> :+++ fwohci_pci.c 7 Nov 2004 19:47:52 -0000
> :@@ -238,6 +238,7 @@
> : PCIM_CMD_SERRESPEN | PCIM_CMD_PERRESPEN;
> : #if 1
> : cmd &= ~PCIM_CMD_MWRICEN;
> :+ cmd &= ~(PCIM_CMD_SERRESPEN | PCIM_CMD_PERRESPEN);
> : #endif
> : pci_write_config(self, PCIR_COMMAND, cmd, 2);
> :...
> :
> :I suppose their code doesn't enable the above flags.
> :...
> :/\ Hidetoshi Shimokawa
> :\/ simokawa at xxxxxxxxxxx
>
> I think you've found it. All the OpenBSD code does is enable the bus
> master bit. It doesn't touch any of the other bits.
>
> The original FreeBSD commit associated with this issue is:
>
> >revision 1.20
> >date: 2003/03/24 03:47:36; author: simokawa; state: Exp; lines: +6 -2
> >Safe PCI configuration.
> >- Clear PCIM_CMD_MWRICEN:
> > some chips seem to have problem with write invalidate.
> > clearing this bit fixes SBP timeout problem.
> >
> >Tested by: Michael Reifenberger <Michael.Reifenberger at xxxxxxxx>
> >
> >- Set PCIM_CMD_SERRESPEN and PCIM_CMD_PERRESPEN
> >- Moderate value for latency timer.
>
> He doesn't explain *WHY* he is turning on SERRESPEN and PERRESPEN.
> Generally, however, any device with its own on-board memory (as these
> devices have) is subject to parity errors on the PCI bus if that
> memory is not completely cleared on boot. And that is what could be
> happening here.
Hmm, PERR and SERR indicate PCI bus parity errors and other fatal errors.
I added them to detect broken hardware. This is the first report of this
error I have ever received.
Are you sure it has something to do with clearing on-chip memory?
Do you know how to clear them?
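
For reference, a minimal sketch of how the error bits in the 16-bit PCI
status register could be picked out when chasing a report like this one.
The bit positions follow the PCI Local Bus specification; the STATUS_*
names and the helper functions are hypothetical and local to this sketch,
not taken from the fwohci driver.

```c
#include <stdint.h>

/* Error-related bits of the PCI status register (PCI Local Bus spec).
 * The STATUS_* names are local to this sketch. */
#define STATUS_MASTER_DATA_PARITY 0x0100u /* master saw or asserted PERR# */
#define STATUS_SIGNALED_SERR      0x4000u /* device asserted SERR# */
#define STATUS_DETECTED_PARITY    0x8000u /* device detected a parity error */

#define STATUS_ERROR_MASK \
    (STATUS_MASTER_DATA_PARITY | STATUS_SIGNALED_SERR | STATUS_DETECTED_PARITY)

/* Return only the parity/system error bits from a status register value. */
uint16_t
status_error_bits(uint16_t status)
{
    return (uint16_t)(status & STATUS_ERROR_MASK);
}

/* Nonzero if the device reported that it asserted SERR#. */
int
status_signaled_serr(uint16_t status)
{
    return (status & STATUS_SIGNALED_SERR) != 0;
}
```

In a real driver the status value would come from a 2-byte config read of
the status register. Note that the detected-parity bit is set by the device
whether or not parity error response is enabled in the command register.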
> Note that in his commit message he had to turn off write-invalidate.
> That's a sure sign of on-chip parity checked memory not being initialized.
I thought PERR/SERR are independent of write-invalidate. Could you
explain more?
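
To make the bit layout concrete, here is a small sketch of the two masking
steps from the patch above, applied to a plain 16-bit value instead of a
real config register. Write-invalidate, parity error response and SERR#
enable are separate bits in the PCI command register, so each can be
cleared without touching the others. The CMD_* names and the function name
are hypothetical and local to this sketch; the bit values follow the PCI
Local Bus specification.

```c
#include <stdint.h>

/* Bits of the 16-bit PCI command register (PCI Local Bus spec).
 * The CMD_* names are local to this sketch. */
#define CMD_BUSMASTER   0x0004u /* bus master enable */
#define CMD_MWI         0x0010u /* memory write and invalidate enable */
#define CMD_PERR_RESP   0x0040u /* parity error response */
#define CMD_SERR_ENABLE 0x0100u /* SERR# enable */

/* Apply the two clearing steps shown in the patch to a command value. */
uint16_t
clear_mwi_and_error_reporting(uint16_t cmd)
{
    /* Step 1: turn off memory write and invalidate. */
    cmd &= (uint16_t)~CMD_MWI;
    /* Step 2 (the added line): turn off SERR# and parity error response. */
    cmd &= (uint16_t)~(CMD_SERR_ENABLE | CMD_PERR_RESP);
    return cmd;
}
```

For example, a command value of 0x0156 comes out as 0x0006: the bus master
bit and the memory-space bit (0x0002) are left alone.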
> I will roll another ISO with the change and post when it's ready.
Thanks,
/\ Hidetoshi Shimokawa
\/ simokawa at xxxxxxxxxxx