dillon at apollo.backplane.com
Thu Dec 1 13:53:08 PST 2005
:wall a lot sooner with MP than with UP, because
:you have to get back to the ring, no matter what
:the intervals are, before they wrap. As you
:increase the intervals (and thus decrease the
:ints/second) you'll lose even more packets,
:because there is less space in the ring when the
:interrupt is generated and less time for the cpu
:to get to it.
:Flow control isn't like XON/OFF where we say "hey
:our buffer is almost full so lets send some flow
:control" at 9600 baud. By the time you're flow
:controlling you've already lost enough packets to
:piss off your customer base. Plus flow
:controlling a big switch will just result in the
:switch dropping the packets instead of you, so
:what have you really gained?
:Packet loss is real, no matter how much you deny
:it. If you don't believe it, then you need a
:better traffic generator.
Now you need to think carefully about what you are actually arguing
about here. You are making generalizations that misinterpret what
features such as interrupt moderation and flow control are intended
to provide.
What I have said, several times now, is that a reasonably modern cpu
is NO LONGER the bottleneck for routing packets. In other words, there
is going to be *plenty* of cpu suds available. The problem isn't cpu
suds, it's the latency after new data becomes available before the cpu
is able to clear the RX ring.
But, and this is where you are misinterpreting the features, the problem
ISN'T that there is ANY latency, it is simply that there is too MUCH latency
*SOMETIMES*. The whole point of having a receive ring and flow control
is to INCREASE the amount of latency that can be tolerated before packets
are lost.
This in turn gives the operating system a far better ability to manage
its processing latencies. All the operating system has to guarantee
to avoid losing packets is that latency does not exceed a certain
calculated value, NOT that the latency has to be minimized. There is
a big difference between those two concepts.
Let's take an example. Including required on-the-wire PAD I think the
minimum packet size is around 64-128 bytes. I'd have to look up the
standard to know for sure (but it's 64 bytes worth on 100BaseT, and
probably something similar for GigE). So let's just say it's 64 bytes.
That is approximately 576 to 650 bits on the wire. Let's say 576 bits.
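The 576 figure can be reproduced from the standard Ethernet framing overheads. This is only a sketch: the helper name wire_bits is made up for illustration, and it assumes an 8-byte preamble/SFD and a 12-byte inter-frame gap. Counting the gap as well gives 672 bits, a little above the rough upper figure quoted above.

```python
# Standard Ethernet framing constants, in bytes.
PREAMBLE_SFD = 8      # preamble plus start-of-frame delimiter
INTERFRAME_GAP = 12   # minimum idle time between frames, in byte times
MIN_FRAME = 64        # minimum frame size, headers and FCS included


def wire_bits(frame_bytes, count_gap=False):
    """Bits of wire time used by one frame (hypothetical helper).

    Frames shorter than the minimum are padded up to MIN_FRAME.
    """
    total = max(frame_bytes, MIN_FRAME) + PREAMBLE_SFD
    if count_gap:
        total += INTERFRAME_GAP
    return total * 8


print(wire_bits(64))                  # 576
print(wire_bits(64, count_gap=True))  # 672
```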
Now let's say you have a 256 entry receive ring and your interrupt
moderation is set so you get around 12 packets per interrupt. So your
effective receive ring is 256 - 12 or 244 entries for 576 bits per
entry if minimally sized packets are being routed. That's around 140,000
bits, or 140 uS at GigE speeds (one bit per nanosecond).
So in such a configuration routing minimally sized packets the
operating system must respond to an interrupt within 140 uS to avoid
flow control being activated.
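As a rough sketch, the ring calculation above looks like this. The function name and parameters are illustrative only, and the link speed of 1 Gbit/s is an assumption carried over from the GigE discussion.

```python
def ring_latency_budget_us(ring_entries, pkts_per_interrupt,
                           bits_per_packet, link_bps=1_000_000_000):
    """Time in microseconds for the rest of the RX ring to fill
    after the interrupt fires (hypothetical helper).

    Assumes the interrupt is raised once pkts_per_interrupt entries
    are already occupied, and packets keep arriving back to back.
    """
    usable_entries = ring_entries - pkts_per_interrupt
    bits_to_fill = usable_entries * bits_per_packet
    return bits_to_fill * 1e6 / link_bps


# 256-entry ring, ~12 packets per interrupt, 576 bits per minimal packet.
print(round(ring_latency_budget_us(256, 12, 576), 1))  # 140.5
```

A larger ring or a lower moderation threshold raises the budget; bigger packets raise it too, since each entry then represents more wire time.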
If we take a more likely scenario... packets with an average size of,
say, 256 bytes (remember that a TCP/IP header is 40 bytes just in itself),
you wind up with around 550,000 bits to fill the receive ring or
a required interrupt latency of no more than 550 uS.
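The same arithmetic for both packet sizes, as a sketch. The constants are assumptions: a 1 Gbit/s link, 244 usable ring entries, and only the 8-byte preamble counted as overhead. Under those assumptions the 256-byte case comes out nearer 515 uS than 550 uS; the exact number depends on which overheads you count, but it is the same ballpark.

```python
LINK_BPS = 1_000_000_000   # assumed GigE link
USABLE_ENTRIES = 256 - 12  # ring size minus packets per interrupt
PREAMBLE_BYTES = 8         # per-packet wire overhead counted here


def budget_us(avg_packet_bytes):
    """Microseconds until the usable part of the ring fills (illustrative)."""
    bits = USABLE_ENTRIES * (avg_packet_bytes + PREAMBLE_BYTES) * 8
    return bits * 1e6 / LINK_BPS


for size in (64, 256):
    print(f"{size:4d} byte packets -> {budget_us(size):6.1f} uS")
```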
550 uS is a very long time. Even 140 uS is quite a long time (keep in
mind that most system calls take less than 5 uS to execute, and many
take less than 1). Most interrupt service routines take only a few
microseconds to execute. Even clearing 200 entries out of a receive
ring would not take more than 10-15 uS. So 140 uS is likely to be
achievable WITHOUT having to resort to real time scheduling or other
special measures.
If you have a bunch of interfaces all having to clear nearly whole
rings (200+ entries) then it can start to get a little iffy, but even
there it would take quite a few interfaces to saturate even a single
cpu, let alone multiple cpus in an MP setup.
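A back-of-the-envelope way to see this, using the numbers above. This is a deliberately crude model and not a measurement: it assumes one cpu clears the rings one after another, that every ring is nearly full, and that the last interface serviced must still be reached within the latency budget. The function name is made up for illustration.

```python
def max_interfaces(budget_us, clear_us):
    """Interfaces one cpu can service back to back before the last
    one in line exceeds its latency budget (crude serial model)."""
    return int(budget_us // clear_us)


# 140 uS budget, 10-15 uS to clear a nearly full ring.
print(max_interfaces(140, 15))  # 9
print(max_interfaces(140, 10))  # 14
```

With larger average packets the budget grows and so does the count, which is why minimally sized packets are the worst case here.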
The key thing to remember here is that the goal is NOT to minimize
interrupt latency, but instead to simply guarantee that interrupt
processing latency does not exceed the ring calculation. That's the
ONLY thing we care about.
While it is true in one sense that minimizing interrupt latency gives
you a bit more margin, the problem with that sort of reasoning is that
the cost of minimizing interrupt latency is often to be far less
cpu-efficient, which means you actually wind up being able to handle
FEWER network interfaces instead of the greater number of network
interfaces you thought you'd be able to handle.
I can think of a number of polling schemes that would be able to
improve overall throughput in a dedicated routing environment, but
they would only be required in the most extreme of situations where
you are already knocking up against the processing capabilities of your
cpu. And as I have said (many times), it takes a lot to actually
saturate a modern cpu.