NFS daemon is very slow in server-to-client direction in TCP mode

Thu Apr 21 21:38:14 PDT 2016

NFS read performance is primarily client-driven.  Usually setting the
read-ahead (for example, mount_nfs -a 4) is the biggest performance driver
for reads.  OpenBSD defaults to -a 1, which is basically no read-ahead.  I
suggest -a 4 at a minimum.  Also make sure the read block size is at least
8192 (which is the default).  Bumping it higher should theoretically work,
but see the next paragraph.  32768 is reasonable for TCP.  65536 is a bit
excessive (if it fails completely, then the TCP buffers are not being sized
properly and the NFS protocol implementation is probably deadlocking
because the TCP buffer fills up or there is too little TCP buffer space).
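
A minimal sketch of what such a client-side mount could look like, written
as a small Python helper that only builds and prints the command line (it
does not run it).  The flags are the OpenBSD mount_nfs ones (-3 for NFSv3,
-T for TCP, -a read-ahead, -r/-w read and write size).  The export
"server:/export" and the mount point "/mnt/nfs" are placeholders, and the
default values are just the numbers suggested above, so treat them as
assumptions to adjust.

```python
import shlex


def build_mount_nfs_cmd(export, mount_point, read_ahead=4,
                        read_size=32768, write_size=32768):
    """Build an OpenBSD-style mount_nfs argument list (does not run it)."""
    return [
        "mount_nfs",
        "-3",                    # NFSv3 rather than v2
        "-T",                    # TCP transport
        "-a", str(read_ahead),   # read-ahead block count
        "-r", str(read_size),    # read block size in bytes
        "-w", str(write_size),   # write block size in bytes
        export,
        mount_point,
    ]


if __name__ == "__main__":
    # "server:/export" and "/mnt/nfs" are hypothetical placeholders.
    cmd = build_mount_nfs_cmd("server:/export", "/mnt/nfs")
    print(shlex.join(cmd))
```

Printing the command instead of executing it makes it easy to compare a
few read-ahead and block-size combinations before mounting as root.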

In terms of streaming reads over TCP, it should be possible to stream at
the full network bandwidth (~90-100 MBytes/sec on a 1 Gigabit network) as
long as the implementation creates the TCP connection properly.  The TCP
buffers have to be made relatively large (I would suggest at least 1MB rx
and tx buffers), and the TCP connection must use window scaling and
TCP_NODELAY.  Otherwise performance will be horrible.  You might need to
enlist one of the devs to check these parameters.  There are often quite a
few knobs to turn to tune it properly, and defaults were set very low
historically.
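
To make those socket parameters concrete, here is a userland sketch in
Python.  It is only an illustration: the NFS client creates its TCP
connection inside the kernel, so this is not how OpenBSD or DragonFly
actually does it, and the helper names are made up for the example.  It
sets 1MB rx/tx buffers and TCP_NODELAY on a plain socket, and computes
the usual window / round-trip-time bound on throughput.  The 1 ms
round-trip time is an assumed figure, not a measurement.

```python
import socket

BUF_BYTES = 1 << 20  # 1 MiB, the minimum suggested above


def make_tuned_socket(buf_bytes=BUF_BYTES):
    """Create an unconnected TCP socket with large buffers, Nagle off."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Set the buffer sizes before connect() so the window scale factor
    # negotiated in the handshake can reflect them.  The kernel may
    # clamp (or on some systems reject) sizes above its configured limit.
    s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, buf_bytes)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, buf_bytes)
    # TCP_NODELAY disables Nagle's algorithm so small RPCs go out at once.
    s.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
    return s


def window_limited_mb_per_sec(window_bytes, rtt_seconds):
    """Upper bound on one connection's throughput: window / RTT."""
    return window_bytes / rtt_seconds / 1e6


if __name__ == "__main__":
    s = make_tuned_socket()
    print("rcvbuf:", s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
    print("sndbuf:", s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
    print("nodelay:", s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY))
    s.close()
    # Without window scaling the window tops out at 65535 bytes.
    print("64K window :", window_limited_mb_per_sec(65535, 0.001), "MB/s")
    print("1MiB window:", window_limited_mb_per_sec(BUF_BYTES, 0.001), "MB/s")
```

With the assumed 1 ms round trip, an unscaled 65535-byte window caps out
below what a 1 Gigabit link can carry, which is why window scaling and
larger buffers matter here.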

NFS write performance is harder.  With an NFSv3 mount (make sure the mount
is NFSv3 and not v2), getting good performance depends almost entirely on
the client-side implementation doing a proper streaming two-phase commit,
so I can't make any suggestions on that part.  A larger buffer size helps
(32768 is reasonable for TCP), and a larger TCP buffer may help too (again,
at least a ~1MB TCP buffer is recommended).  The DFly server side won't be
a bottleneck for either reads or writes.

--

When testing read performance be sure to take into account server-side
caching.  A freshly booted server will have to read the file from storage,
which with a hard drive might have limited bandwidth (e.g. ~70MB/sec).  If
you remount (umount/mount) to clear the client-side cache, then on the
second run the file will be fully cached on the server and the read test
will be testing purely network bandwidth (e.g. ~90-100 MB/sec).  If no
client-side remount occurs, then on a second or later run some of the data
may already be on the client and result in no network transaction at all
(giving very high apparent throughput that does not reflect real NFS
performance).
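
A rough way to run that kind of read test is a small script like the one
below.  This is a sketch only, assuming Python is available on the client.
The function name and the default 32768-byte block size are my choices for
the example.  It reads a file from start to end and reports MB/sec.

```python
import sys
import time


def time_sequential_read(path, block_size=32768):
    """Read path from start to end; return (bytes_read, MB_per_sec)."""
    total = 0
    start = time.perf_counter()
    # buffering=0 so each read() is one system call asking for up to
    # block_size bytes, with no extra Python-level buffering in between.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    rate = total / elapsed / 1e6 if elapsed > 0 else float("inf")
    return total, rate


if __name__ == "__main__" and len(sys.argv) == 2:
    nbytes, mb_per_sec = time_sequential_read(sys.argv[1])
    print(f"{nbytes} bytes at {mb_per_sec:.1f} MB/sec")
```

Run it against a large file on the NFS mount, then umount/mount on the
client and run it again.  The first run after a server boot is limited by
the server's disk, the run after the remount by the network.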

With read-ahead and proper TCP settings the entire data path should be able
to stream at nearly full bandwidth uncached (whichever is the limiting
factor, network or disk).

-Matt

On Thu, Apr 21, 2016 at 6:11 PM, Predrag Punosevac <punosevac72 at gmail.com>
wrote:

> This is a very interesting thread. I just played a little bit with dd on my
> desktop machine running OpenBSD 5.9 amd64 NFS-client. NFS server runs on
> DragonFly BSD 4.4.2. No optimization of any kind has been done. My home
> network is 1 Gigabit. I will play over the weekend with various block
> sizes and try to use iozone to get something more interesting.
>
> UDP mount
>
> write: 17.014003 MB/s
> read: 22.397014 MB/s
>
> TCP mount:
>
> write: 9.338817 MB/s
> read: 20.47062 MB/s
>
> Best,
> Predrag
>
> P.S. Yes under the NFS DF is using HAMMER so I am able to get history,
> snapshot, and all that nifty stuff.
>