DP performance
Matthew Dillon
dillon at apollo.backplane.com
Tue Nov 29 10:43:06 PST 2005
:Should we be really that pessimistic about potential MP performance,
:even with two NICs only? Typically packet flows are bi-directional,
:and if we could have one CPU/core taking care of one direction, then
:there should be at least some room for parallelism, especially once the
:parallelized routing tables see the light. Of course provided that
:each NIC is handled by a separate core, and that IPC doesn't become the
:actual bottleneck.
The problem is that if you only have two interfaces, every incoming
packet being routed has to go through both interfaces, which means
that there will be significant memory contention between the two cpus
no matter what you do. This won't degrade the 2xCPUs by 50%... it's
probably more like 20%, but if you only have two ethernet interfaces
and the purpose of the box is to route packets, there isn't much of a
reason to make it an SMP box. cpu's these days are far, far faster than
two measly GigE ethernet interfaces that can only do 200 MBytes/sec each.
Even more to the point, if you have two interfaces you still only have
200 MBytes/sec worth of packets to contend with, even though each incoming
packet is being shoved out the other interface (for 400 MBytes/sec of
total network traffic). It is still only *one* packet that the cpu is
routing. Even cheap modern cpus can shove around several GBytes/sec
without DMA so 200 MBytes/sec is really nothing to them.
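The arithmetic above can be sketched in a few lines. The 200 MBytes/sec
per-interface figure is the one used in the post; the CPU copy bandwidth
is an assumed illustrative value standing in for "several GBytes/sec":

```python
# Back-of-envelope check of the routing-load argument.
link_rate_mb = 200               # MB/s of incoming packets per interface (from the post)
routed_mb = link_rate_mb         # each packet is routed once: in one side, out the other
wire_total_mb = 2 * link_rate_mb # total traffic seen across both interfaces
cpu_copy_gb = 4                  # assumed CPU copy bandwidth, GB/s ("several GB/s")

fraction = routed_mb / (cpu_copy_gb * 1000)
print(f"routed: {routed_mb} MB/s, on the wire: {wire_total_mb} MB/s")
print(f"routed load as a fraction of CPU copy bandwidth: {fraction:.0%}")
```

Even with a conservative copy-bandwidth assumption, the routed stream is
only a few percent of what one cpu can move.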
:> Main memory bandwidth used to be an issue but isn't so much any
:> more.
:
:The memory bandwidth isn't but latency _is_ now the major performance
:bottleneck, IMO. DRAM access latencies are now in 50 ns range and will
:not noticeably decrease in the foreseeable future. Consider the amount
:of independent memory accesses that need to be performed on per-packet
:...
:Cheers
:
:Marko
No, this is irrelevant. All modern ethernet devices (for the last decade
or more) have DMA engines and fairly significant FIFOs, which means that
nearly all memory accesses are going to be burst accesses capable of
getting fairly close to the maximum burst bandwidth of the memory. I
can't say for sure that this is actually happening without putting
a logic analyzer on the memory bus, but I'm fairly sure it is. I seem
to recall that the PCI (PCIx, PCIe, etc) bus DMA protocols are all burst
capable protocols.
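A rough per-packet latency budget shows why the 50 ns DRAM figure from
the quoted mail need not dominate at these rates. The 1500-byte frame
size is an assumption (full-size Ethernet frames); the 200 MBytes/sec
and ~50 ns numbers come from the thread:

```python
# Per-packet time budget vs. DRAM access latency.
rate_bytes = 200e6   # routed traffic in bytes/sec (from the post)
pkt_bytes = 1500     # assumed full-size Ethernet frames
dram_ns = 50         # DRAM access latency cited in the quoted mail

pps = rate_bytes / pkt_bytes       # packets per second
budget_ns = 1e9 / pps              # time available per packet
print(f"{pps:,.0f} packets/s, {budget_ns:,.0f} ns per packet")
print(f"~{budget_ns / dram_ns:.0f} uncached DRAM accesses fit in each packet's budget")
```

At full-size frames there is room for on the order of 150 uncached
accesses per packet, and burst DMA means most accesses aren't isolated
random latencies anyway.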
-Matt
Matthew Dillon
<dillon at xxxxxxxxxxxxx>