DP performance

Thu Dec 1 18:48:56 PST 2005

--- Matthew Dillon <dillon at xxxxxxxxxxxxxxxxxxxx> wrote:

> :...
> :wall a lot sooner with MP than with UP, because
> :you have to get back to the ring, no matter what
> :the intervals are, before they wrap. As you
> :increase the intervals (and thus decrease the
> :ints/second) you'll lose even more packets,
> :because there is less space in the ring when the
> :interrupt is generated and less time for the cpu
> :to get to it.
> :
> :Flow control isn't like XON/XOFF where we say "hey
> :our buffer is almost full so let's send some flow
> :control" at 9600 baud. By the time you're flow
> :controlling you've already lost enough packets to
> :piss off your customer base. Plus flow
> :controlling a big switch will just result in the
> :switch dropping the packets instead of you, so
> :what have you really gained?
> :
> :Packet loss is real, no matter how much you deny
> :it. If you don't believe it, then you need a
> :better traffic generator.
> :
> :DT
> 
>     Now you need to think carefully about what you are actually arguing
>     about over here.   You are making generalizations that are an
>     incorrect interpretation of what features such as interrupt moderation
>     and flow control are intended to provide.
> 
>     What I have said, several times now, is that a reasonably modern cpu
>     is NO LONGER the bottleneck for routing packets.  In other words, there
>     is going to be *plenty* of cpu suds available. The problem isn't cpu
>     suds, it's the latency after new data becomes available before the cpu
>     is able to clear the RX ring.
> 
>     But, and this is where you are misinterpreting the features, the problem
>     is NOT that there is ANY latency, it is simply that there is too MUCH
>     latency *SOMETIMES*.  The whole point of having a receive ring and flow
>     control is to INCREASE the amount of latency that can be tolerated
>     before packets are lost.
> 
>     This in turn gives the operating system a far better ability to manage
>     its processing latencies.  All the operating system has to guarantee
>     to avoid losing packets is that latency does not exceed a certain
>     calculated value, NOT that the latency has to be minimized.  There is
>     a big difference between those two concepts.
> 
>     Let's take an example.  Including required on-the-wire PAD I think the
>     minimum packet size is around 64-128 bytes.  I'd have to look up the
>     standard to know for sure (but it's 64 bytes worth on 100BaseT, and
>     probably something similar for GigE).  So let's just say it's 64 bytes.
>     That is approximately 576 to 650 bits on the wire.  Let's say 576 bits.
>     Now let's say you have a 256 entry receive ring and your interrupt
>     moderation is set so you get around 12 packets per interrupt.  So your
>     effective receive ring is 256 - 12 or 244 entries for 576 bits per
>     entry if minimally sized packets are being routed.  That's around 140,000
>     bits, or 140 uS at GigE speed.
> 
>     So in such a configuration routing minimally sized packets the
>     operating system must respond to an interrupt within 140 uS to avoid
>     flow control being activated.
> 
>     If we take a more likely scenario... packets with an average size of,
>     say, 256 bytes (remember that a TCP/IP header is 40 bytes just in itself),
>     you wind up with around 550,000 bits to fill the receive ring or
>     a required interrupt latency of no more than 550 uS.
> 
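The ring arithmetic in the quoted example can be checked with a short
Python sketch. It is only an illustration under stated assumptions: the
function name and parameters are hypothetical, the 1 Gbit/s link speed is
inferred from "140,000 bits, or 140 uS", and the 2208-bit wire size for a
256-byte packet assumes 8 bytes of preamble plus a 12-byte interframe gap,
which the post does not spell out.

```python
def ring_latency_budget_us(ring_entries, pkts_per_interrupt,
                           wire_bits_per_pkt,
                           link_bits_per_sec=1_000_000_000):
    """Time (in microseconds) the wire needs to fill the ring entries
    that are still free when the interrupt fires.

    All names here are illustrative; the default link speed is an
    assumption (gigabit Ethernet).
    """
    # Entries still free at interrupt time: the moderation setting has
    # already let pkts_per_interrupt packets accumulate in the ring.
    free_entries = ring_entries - pkts_per_interrupt
    bits_to_fill = free_entries * wire_bits_per_pkt
    return bits_to_fill * 1e6 / link_bits_per_sec


# Minimum-size packets, using the post's figure of 576 bits on the wire.
small = ring_latency_budget_us(256, 12, 576)

# 256-byte packets; (256 + 8 + 12) * 8 = 2208 bits is an assumed wire size.
average = ring_latency_budget_us(256, 12, (256 + 8 + 12) * 8)

# Heavier moderation (64 packets per interrupt) leaves fewer free entries,
# so the budget shrinks, which is the effect the earlier message describes.
coalesced = ring_latency_budget_us(256, 64, 576)

print(f"min-size packets: {small:.1f} us")
print(f"256-byte packets: {average:.1f} us")
print(f"64 pkts/interrupt: {coalesced:.1f} us")
```

With these assumptions the first two results land close to the 140 uS and
550 uS figures quoted above; the third shows how the budget tightens as
the moderation interval grows.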
>     550 uS is a very long time.  Even 140 uS is quite a long time (keep in
>     mind that most system calls take less than 5 uS to execute, and many
>     take less than 1).  Most interrupt service routines take only a few
>     microseconds to execute.  Even clearing 200 entries out of a receive
>     ring would not take more than 10-15 uS.  So 140 uS is likely to be
>     achievable WITHOUT having to resort to real time scheduling or other
>     methods.
> 
>     If you have a bunch of interfaces all having to clear nearly whole
>     rings (200+ entries) then it can start to get a little iffy, but even
>     there it would take quite a few interfaces to saturate even a single
>     cpu, let alone multiple cpus in an MP setup.
> 
>     The key thing to remember here is that the goal here is NOT to minimize
>     interrupt latency, but instead to simply guarantee that interrupt
>     processing latency does not exceed the ring calculation.  That's the
>     ONLY thing we care about.
> 
>     While it is true in one sense that minimizing interrupt latency gives
>     you a bit more margin, the problem with that sort of reasoning is that
>     the cost of minimizing interrupt latency is often to be far less
>     cpu-efficient, which means you actually wind up being able to handle
>     FEWER network interfaces instead of the greater number of network
>     interfaces you thought you'd be able to handle.
> 
>     I can think of a number of polling schemes that would be able to
>     improve overall throughput in a dedicated routing environment, but
>     they would only be required in the most extreme of situations where
>     you are already knocking up against the processing capabilities of your
>     cpu.  And as I have said (many times), it takes a lot to actually
>     saturate a modern cpu.

You obviously have forgotten the original premise
of this (which is how do we get past the "wall"
of UP networking performance), and you also
obviously have no practical experience with
heavily utilized network devices, because you
seem to have no grasp on the real issues. 

Being smart is not about knowing everything; it's
about recognizing when you don't and making an
effort to learn. I seem to remember you saying
that there was no performance advantage to PCI-X
as well not so long ago. It's really quite amazing
to me that you can continue to stick to arguments
that are so easily provable to be wrong. Perhaps
someday you'll trade in your slide rule and get
yourself a good test bed. College is over. Time
to enter reality, where the results are almost
never whole numbers.

DT
