git: tcp: Implement per-cpu lport cache for listen sockets.
sephe at crater.dragonflybsd.org
Thu Mar 17 05:28:57 PDT 2016
Author: Sepherosa Ziehau <sephe at dragonflybsd.org>
Date: Tue Mar 15 20:45:36 2016 +0800
tcp: Implement per-cpu lport cache for listen sockets.
To guard against reincarnation of an accepted connection after the
listen socket is closed, the accepted socket is linked onto the same
global lport hash list as the listen socket. However, on a busy TCP
server, this can cause a lot of contention on the global lport hash
list.
On closer inspection, though: as long as the listen socket is not
closed, reincarnation of an accepted connection is _impossible_, since
the listen socket itself is on the global lport hash list.
Given the above background, this commit changes where an accepted
socket is linked while the listen socket is still open:
- Create a per-cpu lport cache for each listen socket.
- Accepted sockets are linked to the listen socket's per-cpu lport
cache, instead of to the global lport hash list.
- Before the listen socket is closed, all of the sockets on the
listen socket's per-cpu lport cache are merged into the global lport
hash list to prevent reincarnation of these connections.
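
The three steps above can be sketched roughly as follows. This is a
user-space illustration only, not the code from this commit: every
name in it (struct sock, struct global_list, struct listen_sock,
NCPU, accept_on_cpu() and so on) is hypothetical, the global lock is
modeled by a plain counter, and the merge is assumed to happen under
a single hold of that lock.

```c
#include <assert.h>
#include <stddef.h>

#define NCPU 4	/* hypothetical cpu count for this sketch */

/* Hypothetical stand-in for a TCP protocol control block. */
struct sock {
	struct sock *next;
	int lport;
};

/* Models the global lport hash list; the counter stands in for its lock. */
struct global_list {
	struct sock *head;
	int count;
	int lock_acquisitions;
};

/* Hypothetical listen socket carrying one private list per cpu. */
struct listen_sock {
	struct sock self;
	struct sock *percpu[NCPU];
};

static void global_insert(struct global_list *g, struct sock *s)
{
	g->lock_acquisitions++;		/* shared lock taken */
	s->next = g->head;
	g->head = s;
	g->count++;
}

static void global_remove(struct global_list *g, struct sock *s)
{
	g->lock_acquisitions++;		/* shared lock taken */
	for (struct sock **pp = &g->head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == s) {
			*pp = s->next;
			s->next = NULL;
			g->count--;
			return;
		}
	}
}

/* The listen socket itself always lives on the global list. */
static void listen_open(struct global_list *g, struct listen_sock *l, int lport)
{
	l->self.lport = lport;
	for (int cpu = 0; cpu < NCPU; cpu++)
		l->percpu[cpu] = NULL;
	global_insert(g, &l->self);
}

/* Accepted sockets go to the per-cpu cache: the global list is untouched. */
static void accept_on_cpu(struct listen_sock *l, struct sock *s, int cpu)
{
	s->lport = l->self.lport;
	s->next = l->percpu[cpu];
	l->percpu[cpu] = s;
}

/* Merge every per-cpu cache into the global list, then drop the listener. */
static void listen_close(struct global_list *g, struct listen_sock *l)
{
	g->lock_acquisitions++;		/* one lock hold for the whole merge */
	for (int cpu = 0; cpu < NCPU; cpu++) {
		while (l->percpu[cpu] != NULL) {
			struct sock *s = l->percpu[cpu];
			l->percpu[cpu] = s->next;
			s->next = g->head;
			g->head = s;
			g->count++;
		}
	}
	global_remove(g, &l->self);
}

/* Returns 1 if some socket on the global list still holds lport. */
static int lport_in_use(const struct global_list *g, int lport)
{
	for (const struct sock *s = g->head; s != NULL; s = s->next)
		if (s->lport == lport)
			return 1;
	return 0;
}

/* Walks through open, 8 accepts, close; returns the lock acquisitions. */
int lport_cache_demo(void)
{
	struct global_list g = { NULL, 0, 0 };
	struct listen_sock l;
	struct sock conns[8];

	listen_open(&g, &l, 80);
	for (int i = 0; i < 8; i++)
		accept_on_cpu(&l, &conns[i], i % NCPU);
	assert(g.count == 1);		/* only the listen socket is global */
	listen_close(&g, &l);
	assert(g.count == 8);		/* accepted sockets are now global */
	assert(lport_in_use(&g, 80));	/* the port is still guarded */
	return g.lock_acquisitions;
}
```

In this model the eight accepts never touch the shared list, so the
global lock is only taken when the listen socket is opened and when it
is closed. Calling lport_cache_demo() from a main() runs the checks.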
This greatly reduces the total contention rate on a busy TCP server:
- From 50K/s to 18K/s, if the # of NIC rings does not match the # of
cpus. And it gives ~7% performance improvement (420Kconn/s ->
- From 30K/s to 800/s, if the # of NIC rings matches the # of cpus.
Though this case gives no further performance improvement, idle cpu
time increases a bit.
Summary of changes:
 sys/netinet/in_pcb.c       |  2 ++
 sys/netinet/tcp_subr.c     | 70 ++++++++++++++++++++++++++++++++++++++++++++--
 sys/netinet/tcp_syncache.c |  6 ++--
 sys/netinet/tcp_usrreq.c   | 18 ++++++++++++
 sys/netinet/tcp_var.h      | 53 +++++++++++++++++++++++++++++++++++
 5 files changed, 145 insertions(+), 4 deletions(-)