nfe(4) for nVidia GigE

Sepherosa Ziehau sepherosa at gmail.com
Sat Aug 26 23:59:59 PDT 2006


On 8/27/06, Matthew Dillon <dillon at xxxxxxxxxxxxxxxxxxxx> wrote:
:Yeah, that's it!!  Thank you very much!!  I didn't figure out the real
:cause in the old version, but instead went to the sidetrack: adding a
:delay in nfe_encap() :-P
:
:Best Regards,
:sephe
    It would be nice if we could determine which fix was the one that
    fixed your MB.  Insofar as I can tell there are three possible
    causes for the watchdog timeouts.
    (1) The hardware races the setting of NFE_TX_VALID in the second ring
        buffer of a multi-buffer TX DMA.  That is, the hardware is actively
        transmitting a prior packet and the driver starts laying down the
        new packet, and the hardware starts trying to transmit the new
        packet before the driver can finish laying it down.  This is due
        to the driver improperly setting NFE_TX_VALID on the first ring
        buffer in the new packet before finishing setting up all the ring
        buffers.
        Your delay had the effect of allowing the hardware to finish up
        all the TX ring buffers and thus be quiescent when new packets
        get queued, avoiding the race.  Insofar as I can tell when you
        KICK the hardware it runs TX ring buffers until it sees one
        without NFE_TX_VALID set, then it goes quiescent until the next
        KICK.
        This is solved by the encap code fixes.
This one fixes my MB's watchdog timeout problem :-)
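
For the archive, the ordering described in (1) can be sketched roughly as below. This is only an illustration, not the actual nfe_encap() code: the descriptor layout, RING_SIZE, the flag values, TX_LASTFRAG and encap_sketch() are all made up for the example, and TX_VALID merely stands in for NFE_TX_VALID. The point is that the first ring buffer of a packet gets its valid bit last, so the hardware cannot start on a half-built packet.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE   8       /* hypothetical ring size */
#define TX_VALID    0x8000  /* stands in for NFE_TX_VALID; value is made up */
#define TX_LASTFRAG 0x0001  /* hypothetical "last buffer of packet" marker */

struct tx_desc {            /* hypothetical descriptor layout */
    uint32_t addr;
    uint16_t len;
    volatile uint16_t flags;
};

struct seg {                /* one DMA segment of the packet */
    uint32_t addr;
    uint16_t len;
};

/*
 * Lay down a multi-buffer packet starting at ring index 'start'.
 * Every descriptor except the first is marked valid as it is filled;
 * the first one is marked valid only after all the others are complete.
 * Returns the ring index following the packet.
 */
static int
encap_sketch(struct tx_desc *ring, int start, const struct seg *segs, int nsegs)
{
    int i, idx = start;

    assert(nsegs > 0 && nsegs <= RING_SIZE);

    for (i = 0; i < nsegs; i++) {
        struct tx_desc *d = &ring[idx];
        uint16_t flags = 0;

        d->addr = segs[i].addr;
        d->len = segs[i].len;
        if (i != 0)
            flags |= TX_VALID;      /* safe: hardware stops at the first */
        if (i == nsegs - 1)
            flags |= TX_LASTFRAG;
        d->flags = flags;
        idx = (idx + 1) % RING_SIZE;
    }

    /*
     * A real driver would need a write barrier / DMA sync here so the
     * descriptor contents are visible to the device before the flag.
     */
    ring[start].flags |= TX_VALID;  /* hand the whole packet over at once */
    return idx;
}
```

With this ordering, a device that runs descriptors until it meets one without the valid bit will stop in front of the new packet until the final store, whether or not it is still busy with an earlier one.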

    (2) The hardware fails to generate a TX completion interrupt.  The
        watchdog comes along and decides to reset the interface.
        This is solved by the fixes in the watchdog code which first attempt
        to drain the TX ring and then KICK it again before giving up and
        resetting the interface (which doesn't solve the problem anyhow, it
        appears).  The KICK seemed to get TX completion interrupts working
        again.
Do you mean that after the KICK in the watchdog handler, normal TX
interrupt behaviour is restored?  Hmm, IMHO, that means our TX descs are
set up properly, but some unknown registers are not, or it may be a
hardware bug :P
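
To make sure I read (2) correctly, here is a rough sketch of the watchdog policy as I understand it: drain the TX ring first, KICK again, and reset only when that did not help. It is not the committed code. The softc fields, the *_sketch() helpers, the "stalled" flag and the assumption that the hardware clears the valid bit on completion are all invented for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_SIZE 8         /* hypothetical ring size */
#define TX_VALID  0x8000    /* stands in for NFE_TX_VALID; value is made up */

struct tx_desc {
    volatile uint16_t flags;
};

struct softc {              /* hypothetical driver state */
    struct tx_desc ring[RING_SIZE];
    int  tx_head;           /* oldest descriptor not yet reclaimed */
    int  tx_queued;         /* descriptors queued and not yet reclaimed */
    bool stalled;           /* previous timeout made no progress */
    int  kicks;             /* counters so the sketch can be observed */
    int  resets;
};

/* Reclaim descriptors the hardware has finished (assumed: valid bit cleared). */
static int
txeof_sketch(struct softc *sc)
{
    int n = 0;

    while (sc->tx_queued > 0 && !(sc->ring[sc->tx_head].flags & TX_VALID)) {
        sc->tx_head = (sc->tx_head + 1) % RING_SIZE;
        sc->tx_queued--;
        n++;
    }
    return n;
}

/* A real driver would write the TX control register here. */
static void
kick_sketch(struct softc *sc)
{
    sc->kicks++;
}

/* A real driver would reinitialize the interface here. */
static void
reset_sketch(struct softc *sc)
{
    sc->resets++;
    sc->tx_head = 0;
    sc->tx_queued = 0;
    sc->stalled = false;
}

static void
watchdog_sketch(struct softc *sc)
{
    int reclaimed = txeof_sketch(sc);   /* drain: maybe only the intr was lost */

    if (sc->tx_queued == 0) {           /* ring fully drained, nothing to do */
        sc->stalled = false;
        return;
    }
    if (reclaimed > 0 || !sc->stalled) {
        sc->stalled = (reclaimed == 0); /* remember a timeout without progress */
        kick_sketch(sc);                /* restart TX instead of resetting */
        return;
    }
    reset_sketch(sc);                   /* last resort */
}
```

In this sketch a reset happens only on a second consecutive timeout with no progress at all.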
Best Regards,
sephe
--
Live Free or Die



