dfbsd nfs client - file descriptor leak

Gennady Proskurin gpr at nnz.ru
Wed Apr 23 08:55:57 PDT 2008


The netstat output shows no significant differences and does not seem to
depend on kern.openfiles.


I ran "ls -l /usr/src/UPDATING" and "ls -l /usr/src/README" on the freebsd
and dfbsd clients.
Running tcpdump on the clients shows two differences:
- the DF flag is missing on dfbsd (seems irrelevant here)
- the dfbsd client uses "fsinfo" where the freebsd client uses "access"


from freebsd client:

19:26:10.001804 IP (tos 0x0, ttl  64, id 55387, offset 0, flags [DF],
proto: TCP (6), length: 156) 10.59.3.10.48901580 > 10.59.3.38.2049: 104
access fh 1137,915245/31068487 003f
19:26:10.002255 IP (tos 0x0, ttl  64, id 5316, offset 0, flags [DF],
proto: TCP (6), length: 176) 10.59.3.38.2049 > 10.59.3.10.48901580:
reply ok 124 access attr: DIR 755 ids 1085/1085 sz 512 c 0023
19:26:10.002571 IP (tos 0x0, ttl  64, id 55388, offset 0, flags [DF],
proto: TCP (6), length: 164) 10.59.3.10.48901581 > 10.59.3.38.2049: 112
lookup fh 1137,915245/31068487 "README"
19:26:10.003035 IP (tos 0x0, ttl  64, id 5317, offset 0, flags [DF],
proto: TCP (6), length: 292) 10.59.3.38.2049 > 10.59.3.10.48901581:
reply ok 240 lookup fh 1137,915245/31069628 REG 644 ids 1085/1085 sz
2816
19:26:10.102454 IP (tos 0x0, ttl  64, id 55391, offset 0, flags [DF],
proto: TCP (6), length: 52) 10.59.3.10.726 > 10.59.3.38.2049: ., cksum
0xfa26 (correct), ack 2727185215 win 65535 <nop,nop,timestamp 2748662478
1135145875>


from dfbsd client:

19:28:16.612420 IP (tos 0x0, ttl  64, id 21333, offset 0, flags [none],
proto: TCP (6), length: 148) 10.59.19.2.1451380236 > 10.59.3.38.2049: 96
fsinfo fh 1137,915245/31349706
19:28:16.615021 IP (tos 0x0, ttl  59, id 17993, offset 0, flags [DF],
proto: TCP (6), length: 220) 10.59.3.38.2049 > 10.59.19.2.1451380236:
reply ok 168 fsinfo POST: DIR 755 ids 1085/1085 sz 512 rtmax 32768
rtpref 32768 wtmax 32768 wtpref 32768 dtpref 32768 rtmult 512 wtmult 512
maxfsz 4398046511103 delta 0.000001
19:28:16.615203 IP (tos 0x0, ttl  64, id 30045, offset 0, flags [none],
proto: TCP (6), length: 168) 10.59.19.2.1451380237 > 10.59.3.38.2049:
116 lookup fh 1137,915245/31349706 "UPDATING"
19:28:16.617353 IP (tos 0x0, ttl  59, id 17994, offset 0, flags [DF],
proto: TCP (6), length: 292) 10.59.3.38.2049 > 10.59.19.2.1451380237:
reply ok 240 lookup fh 1137,915245/31353680 REG 644 ids 0/1085 sz 9474
19:28:16.710894 IP (tos 0x0, ttl  64, id 64120, offset 0, flags [none],
proto: TCP (6), length: 52) 10.59.19.2.800 > 10.59.3.38.2049: ., cksum
0x270b (correct), ack 409 win 65535 <nop,nop,timestamp 33949536
1135272468>


10.59.3.10 - freebsd client
10.59.19.2 - dfbsd client
10.59.3.38 - freebsd nfs server

At first glance it seems that freebsd has some problem with "fsinfo". I'll
try to dig deeper later.
 

-----Original Message-----
From: Rick Macklem [mailto:rick at snowhite.cis.uoguelph.ca] 
Sent: Wednesday, April 23, 2008 6:15 PM
Subject: dfbsd nfs client - file descriptor leak

> The problem is: when accessing files from dfbsd client, nfs server
> "leaks" file descriptors

Hmm, interesting... nfs servers don't open files (nfsv4 has an Open, but it
is really a type of file lock and not a POSIX-like open).

My thought is that it might be doing a lot of reconnects and ending up
with lots of sockets on the server. You could take a look at "netstat -a"
on the server box while this is happening and see if there are lots of
connections from the client (or switch to using UDP and see if that makes
the problem go away).
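
The netstat check suggested above could be automated roughly as follows.
This is only a sketch, not something from the original thread: it assumes
BSD-style "netstat -an" output (columns: proto, recv-q, send-q, local
address, foreign address, state, with the port appended to the address
after a dot) and that the NFS server listens on port 2049, as in the
tcpdump traces. The function name count_nfs_connections is made up for
illustration.

```python
from collections import Counter


def count_nfs_connections(netstat_text, nfs_port="2049"):
    """Count TCP connections to the local NFS port, keyed by (remote host, state).

    Assumes BSD-style `netstat -an` lines such as:
        proto recv-q send-q local-addr.port foreign-addr.port state
    """
    counts = Counter()
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers, UDP sockets (no state column) and anything malformed.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local, foreign, state = fields[3], fields[4], fields[5]
        if not local.endswith("." + nfs_port):
            continue
        # BSD netstat writes the port after the last dot of the address.
        host, _, _port = foreign.rpartition(".")
        if not host or host == "*":
            continue  # listening socket, foreign address is "*.*"
        counts[(host, state)] += 1
    return counts
```

Feeding it the output of "netstat -an" captured on the server a few times
while the client is active would show whether the number of sockets from
one client keeps growing, which is what a reconnect storm would look like.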

One of the problems in an NFS client using TCP is deciding how long to
wait for a response on a TCP connection before giving up and creating a
new connection. The really old BSD code waited until the TCP layer decided
the connection was dead, but that could take a very long time. My current
client (not what is in DragonflyBSD) waits 1 minute, which seems to be
working out pretty well, but...
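
To make the timeout idea concrete, here is a minimal sketch of such a
policy. It is not taken from any real NFS client: the class and method
names are hypothetical, and it simplifies by tracking only how long the
client has been waiting without any reply at all, treating any reply as
proof that the connection is alive. The 60 second default mirrors the
"1 minute" mentioned above.

```python
import time


class ReconnectPolicy:
    """Decide when to give up on a silent TCP connection (illustrative only)."""

    def __init__(self, timeout_s=60.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock          # injectable so the policy can be tested
        self.waiting_since = None   # when we started waiting for a reply

    def request_sent(self):
        # Start the timer only if we were not already waiting.
        if self.waiting_since is None:
            self.waiting_since = self.clock()

    def reply_received(self):
        # Simplification: any reply shows the connection still works.
        self.waiting_since = None

    def should_reconnect(self):
        # True once we have waited at least timeout_s with no reply.
        if self.waiting_since is None:
            return False
        return self.clock() - self.waiting_since >= self.timeout_s
```

The trade-off is the one described above: a short timeout causes needless
reconnects (and extra sockets on the server) when the server is merely
slow, while a long one leaves the client hanging on a dead connection.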

Good luck with it, rick




