Fatal trap 19: non-maskable interrupt trap while in kernel mode

Matthew Dillon dillon at apollo.backplane.com
Sun Nov 7 12:22:33 PST 2004


:During the boot, interrupts are disabled and this shouldn't be a problem.
:And the interrupt vector is already initialized in fwohci_pci_attach()
:before fwohci_init() is called.
:It's a good idea that we should mask IT/IR interrupt before probing channels
:but it should have nothing to do with this problem.
:
:As interrupts are disabled during the boot, it must not be a usual interrupt
:but an NMI. I think it's a PCI bus problem rather than RAM.
:
:Try the following patch,
:
:Index: fwohci_pci.c
:===================================================================
:RCS file: /home/dcvs/src/sys/bus/firewire/fwohci_pci.c,v
:retrieving revision 1.15
:diff -u -r1.15 fwohci_pci.c
:--- fwohci_pci.c	18 Jul 2004 12:37:03 -0000	1.15
:+++ fwohci_pci.c	7 Nov 2004 19:47:52 -0000
:@@ -238,6 +238,7 @@
: 		PCIM_CMD_SERRESPEN | PCIM_CMD_PERRESPEN;
: #if 1
: 	cmd &= ~PCIM_CMD_MWRICEN; 
:+	cmd &= ~(PCIM_CMD_SERRESPEN | PCIM_CMD_PERRESPEN);
: #endif
: 	pci_write_config(self, PCIR_COMMAND, cmd, 2);
:...
:
:I suppose their code doesn't enable above flags.
:...
:/\ Hidetoshi Shimokawa
:\/  simokawa at xxxxxxxxxxx

    I think you've found it.  All the OpenBSD code does is enable the bus
    master bit.  It doesn't touch any of the other bits.

    The original FreeBSD commit associated with this issue is:

>revision 1.20
>date: 2003/03/24 03:47:36;  author: simokawa;  state: Exp;  lines: +6 -2
>Safe PCI configuration.
>- Clear PCIM_CMD_MWRICEN:
>        some chips seem to have problem with write invalidate.
>        clearing this bit fixes SBP timeout problem.
>
>Tested by: Michael Reifenberger <Michael.Reifenberger at xxxxxxxx>
>
>- Set PCIM_CMD_SERRESPEN and PCIM_CMD_PERRESPEN
>- Moderate value for latency timer.

    He doesn't explain *WHY* he is turning on SERRESPEN and PERRESPEN. 
    Generally, however, any device with its own on-board memory (as these
    devices have) is subject to parity errors on the PCI bus if that
    memory is not completely cleared on boot.  And that is what could be
    happening here.

    Note that in his commit message he had to turn off write-invalidate.
    That's a sure sign of on-chip parity-checked memory not being initialized.
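
    As a rough sketch of the change under discussion, the C fragment below
    models the PCI command register update as a pure function.  The bit
    values follow the standard PCI command register layout.  The function
    name fwohci_safe_cmd() is hypothetical, and the assumption that the
    driver enables memory space and bus mastering is illustrative rather
    than taken from the driver source.

```c
#include <assert.h>
#include <stdint.h>

/* Standard PCI command register bits (offset 0x04 in config space). */
#define PCIM_CMD_MEMEN        0x0002  /* memory space enable */
#define PCIM_CMD_BUSMASTEREN  0x0004  /* bus master enable */
#define PCIM_CMD_MWRICEN      0x0010  /* memory write and invalidate enable */
#define PCIM_CMD_PERRESPEN    0x0040  /* parity error response enable */
#define PCIM_CMD_SERRESPEN    0x0100  /* SERR# enable */

/*
 * Hypothetical helper: compute the command register value with the
 * proposed patch applied.  Memory space and bus mastering are enabled
 * (an assumption about what the driver needs), while write-invalidate
 * and both error-reporting bits are cleared so that a parity error on
 * the bus is not escalated via SERR#, which typically arrives as an NMI.
 */
static uint16_t
fwohci_safe_cmd(uint16_t cmd)
{
	cmd |= PCIM_CMD_MEMEN | PCIM_CMD_BUSMASTEREN;
	cmd &= (uint16_t)~(PCIM_CMD_MWRICEN |
	    PCIM_CMD_SERRESPEN | PCIM_CMD_PERRESPEN);
	return (cmd);
}
```

    In the driver the result would be written back with
    pci_write_config(self, PCIR_COMMAND, cmd, 2), as in the quoted diff.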

    I will roll another ISO with the change and post when it's ready.

					-Matt
					Matthew Dillon 
					<dillon at xxxxxxxxxxxxx>




