nfs (client) directory caching bug

Andrew Atrens atrens at nortel.com
Wed Dec 13 13:25:00 PST 2006


Hi Folks,

I'm running 1.6.x on my desktop box these days (recently upgraded from 1.4.1)
and am seeing some odd nfs behaviour with my clearcase view_server linux
binaries.

I've tried various mount options (v2, v3, soft, udp, tcp) with little effect.

What *did* make a noticeable improvement however was switching from an SMP to a
UP kernel...

So here's the behaviour, as best as I can describe it ..

Whenever I want to create an element (file) in clearcase, I run mkelem on an
existing file in my view.

mkelem consults a magic file to determine the type of the file and in turn
invokes a file type 'manager' that looks in a directory containing executable
methods for dealing with that type of file -

Here are the 'handlers' for the text_file_delta manager -

$ ls -l /opt/rational/clearcase/lib/mgrs/text_file_delta/
total 34
lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 annotate@ -> tfdmgr
lrwxrwxrwx  1 root  bin       22 Dec 13 00:35 compare@ -> ../../../bin/cleardiff
lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 construct_version@ -> tfdmgr
lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 create_branch@ -> tfdmgr
lrwxrwxrwx  1 root  bin        6 Dec 13 15:59 create_element@ -> tfdmgr
lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 create_version@ -> tfdmgr
lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 delete_branches_versions@ -> tfdmgr
lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 get_cont_info@ -> tfdmgr
lrwxrwxrwx  1 root  bin       22 Dec 13 00:35 merge@ -> ../../../bin/cleardiff
-r-xr-xr-x  1 root  bin    27770 Jun  1  2005 tfdmgr*
lrwxrwxrwx  1 root  bin       23 Dec 13 00:35 xcompare@ -> ../../../bin/xcleardiff
lrwxrwxrwx  1 root  bin       23 Dec 13 00:35 xmerge@ -> ../../../bin/xcleardiff

When I invoke my command, the type handler consults the vob database, which is nfs mounted, for
a temporary file that it never sees... even though the file exists.


Here's the invocation of mkelem -

-- atrens at atrens: /localdisk/viewstore/atrens_VxWorks-5.5.2/vobs/bcs/Tornado-2.2.x/docs/bspkit (16:02) --
$ cleartool mkelem -nc LIB.SUB

Since this is failing fairly regularly, I stubbed in my own create_element handler to see what's going on -

$ ls -l /opt/rational/clearcase/lib/mgrs/text_file_delta/
total 34
lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 annotate@ -> tfdmgr
lrwxrwxrwx  1 root  bin       22 Dec 13 00:35 compare@ -> ../../../bin/cleardiff
lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 construct_version@ -> tfdmgr
lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 create_branch@ -> tfdmgr
-rwxrwxr-x  1 root  wheel   5032 Dec 13 15:59 create_element*
lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 create_version@ -> tfdmgr
lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 delete_branches_versions@ -> tfdmgr
lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 get_cont_info@ -> tfdmgr
lrwxrwxrwx  1 root  bin       22 Dec 13 00:35 merge@ -> ../../../bin/cleardiff
-r-xr-xr-x  1 root  bin    27770 Jun  1  2005 tfdmgr*
lrwxrwxrwx  1 root  bin       23 Dec 13 00:35 xcompare@ -> ../../../bin/xcleardiff
lrwxrwxrwx  1 root  bin       23 Dec 13 00:35 xmerge@ -> ../../../bin/xcleardiff

Here's the source code for the create_element stub -

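#include <sys/types.h>
#include <sys/stat.h>
#include <stdio.h>
#include <unistd.h>     /* sync(), sleep(), execve() */

int main(int argc, char **argv, char **envp) {
   int x;
   struct stat sb;

   /* Dump the arguments the type manager was invoked with. */
   for (x = 0 ; x < argc; x++ )
        puts(argv[x]);

   if (argc < 5) {
        fputs("expected at least 5 arguments\n", stderr);
        return 1;
   }

   /*
    * Wait for the temporary file to become visible.
    * NOTE: going by the argument dump in the output, argv[4] looks like the
    * third OID and the tmp file path is argv[5]; if so, this stat() is
    * aimed at the wrong string.
    */
   while ( stat(argv[4], &sb) == -1 ) {
        perror("stat failed");
        sync();
        sleep(1);
   }

   /* Hand off to the real handler; execve() only returns on failure. */
   execve("/opt/rational/clearcase/lib/mgrs/text_file_delta/tfdmgr", argv, envp);
   perror("execve failed");
   return 1;
}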


Now, when I invoke it I see this -

-- atrens at atrens: /localdisk/viewstore/atrens_VxWorks-5.5.2/vobs/bcs/Tornado-2.2.x/docs/bspkit (16:16) --
$ cleartool mkelem -nc LIB.SUB
/opt/rational/clearcase/lib/mgrs/text_file_delta/create_element
45806de2
5241b78a.8af011db.8c99.ac:e6:df:90:6a:9f
5241b78e.8af011db.8c99.ac:e6:df:90:6a:9f
5241b792.8af011db.8c99.ac:e6:df:90:6a:9f
/net/zcars0xx/export/vobstore/disk2/OM5K/bcs.vbs/s/sdft/1f/15/tmp_12902.1
stat failed: No such file or directory
stat failed: No such file or directory
stat failed: No such file or directory
stat failed: No such file or directory
stat failed: No such file or directory
^Z
[1]+  Stopped                 cleartool mkelem -nc LIB.SUB

-- atrens at atrens: /localdisk/viewstore/atrens_VxWorks-5.5.2/vobs/bcs/Tornado-2.2.x/docs/bspkit (16:17) --
$ ls /net/zcars0xx/export/vobstore/disk2/OM5K/bcs.vbs/s/sdft/1f/15/tmp_12902.1
/net/zcars0xx/export/vobstore/disk2/OM5K/bcs.vbs/s/sdft/1f/15/tmp_12902.1

-- atrens at atrens: /localdisk/viewstore/atrens_VxWorks-5.5.2/vobs/bcs/Tornado-2.2.x/docs/bspkit (16:17) --
$ ls -l /net/zcars0xx/export/vobstore/disk2/OM5K/bcs.vbs/s/sdft/1f/15/tmp_12902.1
-rw-rw-rw-  1 vobroot  opt_ne  0 Dec 13 16:17 /net/zcars0xx/export/vobstore/disk2/OM5K/bcs.vbs/s/sdft/1f/15/tmp_12902.1

-- atrens at atrens: /localdisk/viewstore/atrens_VxWorks-5.5.2/vobs/bcs/Tornado-2.2.x/docs/bspkit (16:17) --
$ cp /net/zcars0xx/export/vobstore/disk2/OM5K/bcs.vbs/s/sdft/1f/15/tmp_12902.1 /tmp

-- atrens at atrens: /localdisk/viewstore/atrens_VxWorks-5.5.2/vobs/bcs/Tornado-2.2.x/docs/bspkit (16:17) --
$ cat /tmp/tmp_12902.1

-- atrens at atrens: /localdisk/viewstore/atrens_VxWorks-5.5.2/vobs/bcs/Tornado-2.2.x/docs/bspkit (16:18) --
$ cat /net/zcars0xx/export/vobstore/disk2/OM5K/bcs.vbs/s/sdft/1f/15/tmp_12902.1

-- atrens at atrens: /localdisk/viewstore/atrens_VxWorks-5.5.2/vobs/bcs/Tornado-2.2.x/docs/bspkit (16:18) --
$ fg
cleartool mkelem -nc LIB.SUB
stat failed: No such file or directory
stat failed: No such file or directory
stat failed: No such file or directory
stat failed: No such file or directory
stat failed: No such file or directory
stat failed: No such file or directory
stat failed: No such file or directory
^C
Interrupt

Interesting, eh?  The file it's looking for is now there but the process can't see it!  It really seems like a
race condition, but I don't understand why the process never recovers after I've resumed it...
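
One way to narrow this down might be to look the file up two ways from inside the same process: by name with stat(), and by reading the parent directory. This is only a sketch. probe_path() and the PROBE_* flags are made-up names for illustration, not anything from clearcase or the kernel, and it assumes the path has no trailing slash.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Result bits for probe_path() (hypothetical helper, for illustration). */
#define PROBE_STAT_OK    0x1   /* stat() on the full path succeeded */
#define PROBE_IN_LISTING 0x2   /* name was found by reading the parent dir */

/*
 * Look up `path` by name (stat) and by scanning its parent directory
 * (opendir/readdir). Returns a bitmask of PROBE_* flags, or -1 if the
 * path is too long or the parent directory cannot be opened.
 */
int probe_path(const char *path)
{
    char dirbuf[1024];
    const char *dir, *name, *slash;
    struct stat sb;
    struct dirent *de;
    DIR *dp;
    int result = 0;

    if (strlen(path) >= sizeof(dirbuf))
        return -1;

    /* Split the path into parent directory and final component. */
    slash = strrchr(path, '/');
    if (slash == NULL) {
        dir = ".";
        name = path;
    } else if (slash == path) {
        dir = "/";
        name = path + 1;
    } else {
        memcpy(dirbuf, path, (size_t)(slash - path));
        dirbuf[slash - path] = '\0';
        dir = dirbuf;
        name = slash + 1;
    }

    /* Lookup by name. */
    if (stat(path, &sb) == 0)
        result |= PROBE_STAT_OK;

    /* Lookup by directory listing. */
    dp = opendir(dir);
    if (dp == NULL)
        return -1;
    while ((de = readdir(dp)) != NULL) {
        if (strcmp(de->d_name, name) == 0) {
            result |= PROBE_IN_LISTING;
            break;
        }
    }
    closedir(dp);

    return result;
}
```

Calling this from the stub's retry loop on the tmp file path and printing the result each time would show whether the two lookups disagree. A result of PROBE_IN_LISTING without PROBE_STAT_OK would suggest a stale name lookup rather than a stale directory listing, though that reading is a guess on my part.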

Heh, as interesting as this problem is, it's really starting to get annoying. :) I wonder if there's any relation
to the infamous gmake bug?

Andrew.





