Thousands of "lookupdotdot failed" messages - should I be worried?

A Dog DragonFly at anthropomorphic.dog
Wed Jun 16 16:37:27 PDT 2021


On Wed, Jun 16, 2021 at 11:06:01AM PDT, Matthew Dillon wrote:
> If you are not seeing any actual I/O errors in the dmesg output, then there
> is probably no issue with the filesystem.
> 
> The dotdot warnings might be some edge-case being caused by the null-mounts
> (because the null-mount has a mount point, but its being mounted on top of
> a sub-directory in an underlying filesystem).  If you can track down the
> operation that is causing message, it might just wind up being a patch to
> the kernel to get rid of the console warning for that particular case.  If
> you can find a simple configuration that I can throw onto a test box to get
> the same error, I can track down the issue and fix it.
> 
> -Matt

Thanks for looking into this.  Unfortunately the dotdot warnings *are* 
coinciding with I/O errors, just on the client side rather than on the 
server.  They seem to be causing stat() to fail on the client, not just 
on directories but on regular files and symlinks as well (although, for 
all I know, statting a file triggers a lookupdotdot on its containing 
directory anyway).

Typically, my shell will inform me of I/O errors when it attempts to 
check my local email in /var/mail, which is an NFS mount (yeah, I know 
we don't have NFS file locking on the client side and I shouldn't be 
doing it this way).  I've also seen find(1) suddenly begin to throw I/O 
errors on literally *every* node it encounters on my NFS mounts.  
Statting anything in that hierarchy will continue to fail if reattempted.
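
For anyone who wants to check the same thing on their own client, here 
is a minimal sketch of a probe, assuming Python 3 is available there.  
The function name stat_failures is made up for illustration.  It walks a 
tree, lstat()s every entry, and collects the errno name for anything 
that fails, so an EIO on regular files and symlinks shows up separately 
from a directory that cannot be listed at all.

```python
import errno
import os


def stat_failures(root):
    """Walk root; return (path, errno_name) for each failed listing or lstat()."""
    failures = []

    def errname(err):
        # Map the numeric errno to its symbolic name, e.g. 5 -> "EIO".
        return errno.errorcode.get(err.errno, str(err.errno))

    def on_walk_error(err):
        # os.walk calls this when a directory itself cannot be listed.
        failures.append((err.filename, errname(err)))

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                # lstat() checks the entry itself, so a symlink is tested
                # without following it to its target.
                os.lstat(path)
            except OSError as err:
                failures.append((path, errname(err)))
    return failures
```

Calling stat_failures() on the NFS mount point (the path is whatever 
your own mount is) returns an empty list while the mount is healthy, and 
a list of (path, errno name) pairs once things start failing.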

Curiously, ssh'ing into the server and doing something like "find 
$exported_mount >/dev/null" and just letting it sit there and traverse 
the PFS on the server side seems to re-enable client access to the 
affected hierarchy.

FWIW, the server is running GENERIC at v6.0.0.5.g53d41-RELEASE and the 
affected PFSes range from a dozen-odd files of less than a meg total to 
millions of files totalling a few terabytes.  Clients I've seen this 
happen on include FreeBSD 13, DragonFly DEVELOPMENT, and Linux 5.4.

I'll try to monitor the circumstances in which these errors occur and 
see if I can find any correlation between them.  So far, it just seems 
completely random to me.  If there's any other info I can provide, I'd 
be happy to oblige.
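
A rough sketch of the sort of monitoring I have in mind, assuming 
Python 3 on a client: poll one path on the affected mount and print a 
timestamped line only when the result changes, so the times can be 
lined up against the server's console messages.  The names probe and 
watch, and the 30-second default interval, are arbitrary choices for 
illustration.

```python
import errno
import os
import time


def probe(path):
    """Return "ok" if lstat() succeeds, else the errno name (e.g. "EIO")."""
    try:
        os.lstat(path)
        return "ok"
    except OSError as err:
        return errno.errorcode.get(err.errno, str(err.errno))


def watch(path, interval=30.0, rounds=None, sleep=time.sleep, out=print):
    """Poll path and emit a timestamped line each time the result changes.

    rounds=None polls forever; sleep and out are injectable for testing.
    """
    last = None
    done = 0
    while rounds is None or done < rounds:
        state = probe(path)
        if state != last:
            stamp = time.strftime("%Y-%m-%d %H:%M:%S")
            out(f"{stamp} {path}: {state}")
            last = state
        done += 1
        if rounds is None or done < rounds:
            sleep(interval)
```

Left running as watch() on a file under the NFS mount, this logs one 
line when access breaks and another when it recovers, which is all that 
is needed to compare against the lookupdotdot messages.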

-- 
A Dog


