IP forwarding performance

Sepherosa Ziehau sepherosa at gmail.com
Wed Dec 19 23:03:28 PST 2012


On Fri, Dec 14, 2012 at 5:47 PM, Sepherosa Ziehau <sepherosa at gmail.com> wrote:
> Hi all,
>
> This email serves as the base performance measurement for further
> network stack optimization (as of git 107282b).

Since bidirectional fast IP forwarding already maxes out the GigE
limit, I have increased the measurement load a bit.  The new measurement
is against git 7e1fbcf.
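
For reference, the GigE limit can be worked out from the frame size of
the 18-byte UDP datagrams used in the tiny-packet test.  This is only a
rough sketch, assuming IPv4 without options and standard Ethernet
framing overheads (14-byte header, 4-byte FCS, 8-byte preamble, 12-byte
inter-frame gap); the function name is made up for illustration:

```python
def gige_max_pps(udp_payload_bytes, link_bps=1_000_000_000):
    """Theoretical packets/second on one direction of a GigE link."""
    # UDP header (8) + IPv4 header without options (20)
    ip_packet = udp_payload_bytes + 8 + 20
    # Ethernet pads the payload up to 46 bytes (64-byte minimum frame)
    eth_payload = max(ip_packet, 46)
    # Ethernet header (14) + payload + FCS (4)
    frame = 14 + eth_payload + 4
    # Preamble/SFD (8) and inter-frame gap (12) also occupy the wire
    on_wire = frame + 8 + 12
    return link_bps / (on_wire * 8)


rate = gige_max_pps(18)
print(f"{rate / 1e6:.3f} Mpps per direction")
```

Under these assumptions an 18-byte datagram fills exactly a minimum-size
frame, giving roughly 1.488Mpps per direction, which is what the
1.48Mpps fast forwarding figure quoted below runs into.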

>
>
> The hardware:
> mobo ASUS P867H-M
> 4x4G DDR3 memory
> CPU i7-2600 (w/ HT and Turbo Boost enabled, 4C/8T)
> Forwarding NIC Intel 82576EB dual copper

The forwarding NIC is now changed to 82580EB quad copper.

> Packet generator NICs Intel 82571EB dual copper
>
>
> A emx1 <---> igb0 forwarder igb1 <---> emx1 B

The testing topology is changed to the following configuration:
+---+                 +-----------+                 +---+
|   | emx1 <---> igb0 |           | igb1 <---> emx1 |   |
| A |                 | forwarder |                 | B |
|   | emx2 <---> igb2 |           | igb3 <---> emx2 |   |
+---+                 +-----------+                 +---+

Streams:
A.emx1 <---> B.emx1 (bidirectional)
A.emx2 <---> B.emx2 (bidirectional)

>
> A and "forwarder", B and "forwarder" are directly connected using CAT6 cables.
> Polling(4) is enabled on igb1 and igb0 on "forwarder".  Following
> tunables are in /boot/loader.conf:
> kern.ipc.nmbclusters="524288"
> net.ifpoll.user_frac="10"
> net.ifpoll.status_frac="1000"
> Following sysctl is changed before putting igb1 into polling mode:
> sysctl hw.igb1.npoll_txoff=4

sysctl hw.igb1.npoll_txoff=1
sysctl hw.igb2.npoll_txoff=2
sysctl hw.igb3.npoll_txoff=3

>
>
> First for the users that are only interested in the bulk forwarding
> performance:  The 32 netperf TCP_STREAMs running on A could do
> 941Mbps.
>
>
> Now the tiny packets forwarding performance:
>
> A and B generate 18 bytes UDP datagrams using
> tools/tools/netrate/pktgen.  The destination addresses of the UDP
> datagrams are selected so that the generated UDP datagrams will be evenly
> distributed to the 8 RX queues, which should be common in the
> production environment.
>
> Bidirectional normal IP forwarding:
> 1.42Mpps in each direction, so total 2.84Mpps are forwarded.
> CPU usage:
> On CPUs that are doing TX in addition to RX: 85% ~ 90% (max allowed by
> polling's user_frac)
> On CPUs that are only doing RX: 40% ~ 50%

Two sets of bidirectional normal IP forwarding:
1.03Mpps in each direction, so total 4.12Mpps are forwarded.
CPU usage:
On CPUs that are doing TX in addition to RX: 90% (max allowed by
polling's user_frac)
On CPUs that are only doing RX: 70% ~ 80%
IPI rate on CPUs that are doing TX in addition to RX: ~10K/s

>
> Bidirectional fast IP forwarding: (net.inet.ip.fastforwarding=1)
> 1.48Mpps in each direction, so total 2.96Mpps are forwarded.
> CPU usage:
> On CPUs that are doing TX in addition to RX: 65% ~ 70%
> On CPUs that are doing RX: 30% ~ 40%

Two sets of bidirectional fast IP forwarding: (net.inet.ip.fastforwarding=1)
1.26Mpps in each direction, so total 5.04Mpps are forwarded.
CPU usage:
On CPUs that are doing TX in addition to RX: 90% (max allowed by
polling's user_frac)
On CPUs that are only doing RX: 60% ~ 70%
IPI rate on CPUs that are doing TX in addition to RX: ~10K/s
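
The totals in this mail are the per-direction rate multiplied by the
number of directions (two per bidirectional stream).  A small sketch of
that arithmetic, using only the figures reported above; the function
name is hypothetical:

```python
def total_mpps(per_direction_mpps, bidirectional_streams):
    """Aggregate forwarded rate: each bidirectional stream has 2 directions."""
    return per_direction_mpps * 2 * bidirectional_streams


results = [
    ("normal forwarding, one set", 1.42, 1),
    ("fast forwarding, one set", 1.48, 1),
    ("normal forwarding, two sets", 1.03, 2),
    ("fast forwarding, two sets", 1.26, 2),
]

for label, rate, streams in results:
    print(f"{label}: {total_mpps(rate, streams):.2f} Mpps")
```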

Best Regards,
sephe

--
Tomorrow Will Never Die


