Network improvement 4.8 -> 5.0 for short-lived HTTP/1.1 workload.
Sepherosa Ziehau
sepherosa at gmail.com
Tue Oct 17 07:57:11 PDT 2017
In this release cycle, several changes were committed to improve
performance and to reduce and stabilize latency.
Test setup: 30K concurrent connections, 1 request/connection, 1KB web object.
The server has 24 hardware threads (HT).
Baseline (32 nginx workers w/ 16 netisrs):
performance 215907.25tps, latency-avg 33.11ms, latency-stdev 41.76ms,
latency-99% 192.36ms.
The performance for 16 nginx workers is too low to serve as the baseline
(16 nginx workers w/ 16 netisrs):
performance 191920.81tps, latency-avg 32.04ms, latency-stdev 25.15ms,
latency-99% 101.37ms.
===================
Make the # of netisrs tunable.
If the # of netisrs is set to ncpus, this allows two optimized settings in nginx:
1) Make the # of nginx workers the same as the # of netisrs.
2) CPU-bind the nginx workers.
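As a sketch, the two settings above might look like this. The loader tunable name is an assumption (check your kernel's actual tunable); worker_processes and worker_cpu_affinity are standard nginx directives:

```conf
# /boot/loader.conf -- tunable name assumed, verify against your kernel
net.netisr.ncpus=24

# nginx.conf -- one worker per netisr, each pinned to its own CPU
worker_processes 24;
worker_cpu_affinity auto;   # automatic per-CPU binding, nginx 1.9.10+
```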
24 nginx workers w/ 24 netisrs:
performance 212556.02tps, latency-avg 56.18ms, latency-stdev 7.90ms,
latency-99% 70.31ms.
24 nginx workers w/ 24 netisrs, cpu-bound nginx workers:
performance 210658.80tps, latency-avg 58.01ms, latency-stdev 5.20ms,
latency-99% 68.73ms.
As you can see, performance dropped a bit. Though average latency
increased, latency is significantly stabilized.
===================
Limit the # of acceptable sockets returned by kevent(2).
24 nginx workers w/ 24 netisrs, cpu-bound nginx workers:
performance 217599.01tps, latency-avg 32.00ms, latency-stdev 2.35ms,
latency-99% 35.59ms.
Compared w/ the baseline, performance improved a bit and average latency
is reduced a bit. More importantly, latency is significantly stabilized.
===================
Summary of the comparison across different web object sizes (latency values in ms):
1KB web object
| perf (tps) | lat-avg | lat-stdev | lat-99%
-------------+------------+---------+-----------+---------
baseline | 215907.25 | 33.11 | 41.76 | 192.36
-------------+------------+---------+-----------+---------
netisr_ncpus | 210658.80 | 58.01 | 5.20 | 68.73
-------------+------------+---------+-----------+---------
kevent.data | 217599.01 | 32.00 | 2.35 | 35.59
8KB web object
| perf (tps) | lat-avg | lat-stdev | lat-99%
-------------+------------+---------+-----------+---------
baseline | 182719.03 | 42.62 | 58.70 | 250.51
-------------+------------+---------+-----------+---------
netisr_ncpus | 181201.11 | 68.78 | 6.43 | 80.68
-------------+------------+---------+-----------+---------
kevent.data | 186324.41 | 37.41 | 4.81 | 48.69
16KB web object
| perf (tps) | lat-avg | lat-stdev | lat-99%
-------------+------------+---------+-----------+---------
baseline | 138625.67 | 72.01 | 65.78 | 304.78
-------------+------------+---------+-----------+---------
netisr_ncpus | 138323.40 | 93.61 | 16.30 | 137.12
-------------+------------+---------+-----------+---------
kevent.data | 138778.11 | 60.90 | 11.80 | 92.07
So performance improved a bit, latency-avg is reduced by 3%~15%,
latency-stdev is reduced by 82%~94%, and latency-99% is reduced by
69%~81%!
+++++++++++++++
And as a bonus, forwarding performance also improved! We can now
do 13.2Mpps (bidirectional forwarding, output packet count) w/
fastforwarding, and 11Mpps w/ normal forwarding.
Thanks,
sephe
--
Tomorrow Will Never Die