Thousands of "lookupdotdot failed" messages - should I be worried?
A Dog
DragonFly at anthropomorphic.dog
Wed Jun 16 16:37:27 PDT 2021
On Wed, Jun 16, 2021 at 11:06:01AM PDT, Matthew Dillon wrote:
> If you are not seeing any actual I/O errors in the dmesg output, then there
> is probably no issue with the filesystem.
>
> The dotdot warnings might be some edge-case being caused by the null-mounts
> (because the null-mount has a mount point, but it's being mounted on top of
> a sub-directory in an underlying filesystem). If you can track down the
> operation that is causing the message, it might just wind up being a patch to
> the kernel to get rid of the console warning for that particular case. If
> you can find a simple configuration that I can throw onto a test box to get
> the same error, I can track down the issue and fix it.
>
> -Matt
Thanks for looking into this. Unfortunately the dotdot warnings *are*
coinciding with I/O errors, just on the client side rather than on the
server. They seem to be causing stat() to fail on the client, not
just on directories but on regular files and symlinks as well (although
statting those probably triggers a lookupdotdot on the containing
directory, for all I know).
Typically, my shell will report I/O errors when attempting to check my
local email in /var/mail, which is an NFS mount (yeah, I know we don't
have NFS file locking on the client side and I shouldn't be doing it
this way). I've also seen find(1) suddenly begin to throw up I/O errors
on literally *every* node it encounters on my NFS mounts. Statting
anything in that hierarchy will continue to fail if reattempted.
Curiously, ssh'ing into the server and doing something like "find
$exported_mount >/dev/null" and just letting it sit there and traverse
the PFS on the server side seems to re-enable client access to the
affected hierarchy.
FWIW, the server is running GENERIC at v6.0.0.5.g53d41-RELEASE and
affected PFS's range from a dozen-odd files of less than a meg total, to
millions of files totalling a few terabytes. Clients I've seen this
happen on include FreeBSD 13, DragonFly DEVELOPMENT, and Linux 5.4.
I'll try to monitor the circumstances in which these errors occur and
see if I can find any correlation between them. So far, it just seems
completely random to me. If there's any other info I can provide, I'd
be happy to oblige.
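For the monitoring, something along these lines run periodically on a client might do. This is only a rough sketch, assuming Python 3 is available there; the function names, the mount path and the log file are placeholders I made up for illustration, not part of my actual setup. It lstat()s every node under a mount and appends a timestamped line for each failure, so the times can be compared against the lookupdotdot messages on the server.

```python
import errno
import os
import time


def probe_tree(root):
    """Walk root and lstat() every entry; return a list of (path, errno name) failures."""
    failures = []

    def on_walk_error(exc):
        # os.walk reports directories it could not list through this callback.
        failures.append((exc.filename, errno.errorcode.get(exc.errno, str(exc.errno))))

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_walk_error):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                os.lstat(path)
            except OSError as exc:
                failures.append((path, errno.errorcode.get(exc.errno, str(exc.errno))))
    return failures


def log_failures(root, logfile):
    """Append one timestamped line per failure; return how many failures were seen."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S %Z")
    failures = probe_tree(root)
    with open(logfile, "a") as log:
        for path, err in failures:
            log.write(f"{stamp} {err} {path}\n")
    return len(failures)


if __name__ == "__main__":
    # Hypothetical paths: substitute a real NFS mount and a log on local disk.
    log_failures("/mnt/nfs-example", "/var/tmp/nfs-stat-failures.log")
```

Keeping the log file on a local disk rather than on the NFS mount matters here, since otherwise the log writes would fail along with everything else.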
--
A Dog