DP performance

Thu Dec 1 13:53:08 PST 2005

:...
:wall a lot sooner with MP than with UP, because
:you have to get back to the ring, no matter what
:the intervals are, before they wrap. As you
:increase the intervals (and thus decrease the
:ints/second) you'll lose even more packets,
:because there is less space in the ring when the
:interrupt is generated and less time for the cpu
:to get to it. 
:
:Flow control isn't like XON/OFF where we say "hey
:our buffer is almost full so lets send some flow
:control" at 9600 baud. By the time you're flow
:controlling you've already lost enough packets to
:piss off your customer base. Plus flow
:controlling a big switch will just result in the
:switch dropping the packets instead of you, so
:what have you really gained?
:
:Packet loss is real, no matter how much you deny
:it. If you don't believe it, then you need a
:better traffic generator.
:
:DT

    Now you need to think carefully about what you are actually arguing
    about over here.   You are making generalizations that are an 
    incorrect interpretation of what features such as interrupt moderation
    and flow control are intended to provide.

    What I have said, several times now, is that a reasonably modern cpu
    is NO LONGER the bottleneck for routing packets.  In other words, there 
    is going to be *plenty* of cpu suds available. The problem isn't cpu
    suds, it's the latency after new data becomes available before the cpu
    is able to clear the RX ring.

    But, and this is where you are misinterpreting the features, the problem
    is NOT that there is ANY latency, it is simply that there is too MUCH
    latency *SOMETIMES*.  The whole point of having a receive ring and flow control
    is to INCREASE the amount of latency that can be tolerated before packets
    are lost.  

    This in turn gives the operating system a far better ability to manage
    its processing latencies.  All the operating system has to guarantee
    to avoid losing packets is that latency does not exceed a certain 
    calculated value, NOT that the latency has to be minimized.  There is 
    a big difference between those two concepts.

    Let's take an example.  Including required on-the-wire PAD I think the
    minimum packet size is around 64-128 bytes.  I'd have to look up the
    standard to know for sure (but it's 64 bytes worth on 100BaseT, and
    probably something similar for GigE).  So let's just say it's 64 bytes.
    That is approximately 576 to 650 bits on the wire.  Let's say 576 bits.
    Now let's say you have a 256 entry receive ring and your interrupt 
    moderation is set so you get around 12 packets per interrupt.  So your
    effective receive ring is 256 - 12 or 244 entries at 576 bits per
    entry if minimally sized packets are being routed.  That's around 140,000
    bits, or 140 uS at GigE speeds (one bit per nanosecond).

    So in such a configuration routing minimally sized packets the
    operating system must respond to an interrupt within 140 uS to avoid
    flow control being activated.

    If we take a more likely scenario... packets with an average size of,
    say, 256 bytes (remember that a TCP/IP header is 40 bytes just in itself),
    you wind up with around 550,000 bits to fill the receive ring or
    a required interrupt latency of no more than 550 uS.
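
    The ring arithmetic above can be sketched in a few lines of Python.
    This is only an illustration, not anything out of a driver: the
    function names are made up, and it assumes a 1 Gbit/s link and 8 bytes
    of preamble per frame (which is how 64 bytes becomes 576 bits).

```python
PREAMBLE_BYTES = 8           # assumed per-frame overhead (preamble + SFD)
GIGE_BPS = 1_000_000_000     # assumed link speed: 1 Gbit/s


def wire_bits(frame_bytes, overhead_bytes=PREAMBLE_BYTES):
    """Bits a frame occupies on the wire, including the assumed overhead."""
    return (frame_bytes + overhead_bytes) * 8


def latency_budget_us(ring_entries, pkts_per_interrupt, frame_bytes,
                      link_bps=GIGE_BPS):
    """Microseconds until the ring fills after the interrupt is raised.

    The entries already occupied when the interrupt fires
    (pkts_per_interrupt) are subtracted from the ring size.
    """
    free_entries = ring_entries - pkts_per_interrupt
    bits_to_fill = free_entries * wire_bits(frame_bytes)
    return bits_to_fill / link_bps * 1e6


if __name__ == "__main__":
    for size in (64, 256):
        budget = latency_budget_us(256, 12, size)
        print(f"{size}-byte packets: about {budget:.0f} uS of slack")
```

    With these assumptions the 64-byte case comes out at about 140 uS and
    the 256-byte case at a bit over 500 uS, the same ballpark as the
    rounded figures above.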

    550 uS is a very long time.  Even 140 uS is quite a long time (keep in
    mind that most system calls take less than 5 uS to execute, and many
    take less than 1).  Most interrupt service routines take only a few
    microseconds to execute.  Even clearing 200 entries out of a receive
    ring would not take more than 10-15 uS.  So 140 uS is likely to be
    achievable WITHOUT having to resort to real time scheduling or other
    methods.

    If you have a bunch of interfaces all having to clear nearly whole
    rings (200+ entries) then it can start to get a little iffy, but even
    there it would take quite a few interfaces to saturate even a single
    cpu, let alone multiple cpus in an MP setup.
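
    To put a rough number on "quite a few", here is a back-of-the-envelope
    sketch in Python.  It is a deliberately crude model: it assumes a
    1 Gbit/s link carrying back-to-back 576-bit packets, counts ONLY the
    10-15 uS ring-clearing cost estimated above, and ignores the actual
    forwarding work, cache effects, and everything else.  The function
    name is made up for illustration.

```python
GIGE_BPS = 1_000_000_000     # assumed link speed: 1 Gbit/s


def max_interfaces_before_saturation(entries_cleared, bits_per_packet,
                                     service_us, link_bps=GIGE_BPS):
    """Interfaces one cpu can service before ring clearing alone eats it.

    Each interface needs `service_us` of cpu time once per ring fill, and
    a fill of `entries_cleared` packets takes `fill_us` at line rate, so
    the per-interface cpu share is service_us / fill_us.
    """
    fill_us = entries_cleared * bits_per_packet / link_bps * 1e6
    return int(fill_us / service_us)


if __name__ == "__main__":
    for service_us in (10, 15):
        n = max_interfaces_before_saturation(200, 576, service_us)
        print(f"{service_us} uS per ring clear: up to {n} interfaces")
```

    Under those assumptions a single cpu covers somewhere between 7 and 11
    interfaces at worst-case minimum packet size; larger average packets
    stretch the fill time and raise that count.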

    The key thing to remember here is that the goal is NOT to minimize
    interrupt latency, but instead to simply guarantee that interrupt
    processing latency does not exceed the ring calculation.  That's the
    ONLY thing we care about.  

    While it is true in one sense that minimizing interrupt latency gives
    you a bit more margin, the problem with that sort of reasoning is that
    the cost of minimizing interrupt latency is often to be far less 
    cpu-efficient, which means you actually wind up being able to handle
    FEWER network interfaces instead of the greater number of network
    interfaces you thought you'd be able to handle.

    I can think of a number of polling schemes that would be able to 
    improve overall throughput in a dedicated routing environment, but
    they would only be required in the most extreme of situations where
    you are already knocking up against the processing capabilities of your
    cpu.  And as I have said (many times), it takes a lot to actually 
    saturate a modern cpu.

						-Matt