nfs (client) directory caching bug

Andrew Atrens atrens at nortel.com
Wed Dec 13 13:51:29 PST 2006


Oops, little bug in the stub below: argv[4] should be argv[5]. Hmmm, that gets
me a little closer to figuring this out, I suppose.  :)
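
For what it's worth, here is a rough sketch of how the wait loop in the stub
quoted below could pick up the right argument and give up after a while instead
of spinning forever. The names handler_arg, wait_for_path and PATH_ARG are made
up for illustration, and the index 5 is only what the argument dump in the
trace suggests, not anything documented by ClearCase.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Assumption: the temp-file path is argv[5], as the argument dump in the
 * quoted trace suggests (argv[0] is the handler itself). */
#define PATH_ARG 5

/* Return argv[idx], or NULL if the handler was given too few arguments. */
const char *handler_arg(int argc, char **argv, int idx)
{
    if (idx < 0 || idx >= argc)
        return NULL;
    return argv[idx];
}

/* Poll stat() on path up to max_tries times, sleeping delay_s seconds
 * between attempts. Returns 0 once the path is visible, -1 otherwise,
 * with errno set to the last stat() error. */
int wait_for_path(const char *path, int max_tries, unsigned delay_s)
{
    struct stat sb;
    int last_err = EINVAL;
    int i;

    assert(max_tries > 0);          /* caller error, not a runtime condition */
    if (path == NULL) {
        errno = EINVAL;
        return -1;
    }
    for (i = 0; i < max_tries; i++) {
        if (stat(path, &sb) == 0)
            return 0;
        last_err = errno;           /* save before stdio can clobber it */
        fprintf(stderr, "stat(%s) failed: %s\n", path, strerror(last_err));
        if (i + 1 < max_tries)
            sleep(delay_s);
    }
    errno = last_err;
    return -1;
}
```

In the stub, main() would call wait_for_path(handler_arg(argc, argv, PATH_ARG),
30, 1) and only then execve() the real tfdmgr. The 30 tries and 1 second delay
are arbitrary.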


Andrew Atrens wrote:
> Hi Folks,
> 
> I'm running 1.6.x on my desktop box these days (recently upgraded from 1.4.1)
> and am experiencing some weirdness around my clearcase view_server linux binaries
> wrt nfs ..
> 
> I've tried various mount options v2, v3, soft, udp, tcp with little effect ..
> 
> What *did* make a noticeable improvement however was switching from an SMP to a
> UP kernel...
> 
> So here's the behaviour, as best as I can describe it ..
> 
> Whenever I want to create an element (file) in clearcase, I run mkelem on an
> existing file in my view.
> 
> mkelem consults a magic file to determine the type of the file and in turn
> invokes a file type 'manager' that looks in a directory containing executable
> methods for dealing with that type of file -
> 
> Here are the 'handlers' for the text_file_delta manager -
> 
> $ ls -l /opt/rational/clearcase/lib/mgrs/text_file_delta/
> total 34
> lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 annotate@ -> tfdmgr
> lrwxrwxrwx  1 root  bin       22 Dec 13 00:35 compare@ -> ../../../bin/cleardiff
> lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 construct_version@ -> tfdmgr
> lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 create_branch@ -> tfdmgr
> lrwxrwxrwx  1 root  bin        6 Dec 13 15:59 create_element@ -> tfdmgr
> lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 create_version@ -> tfdmgr
> lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 delete_branches_versions@ -> tfdmgr
> lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 get_cont_info@ -> tfdmgr
> lrwxrwxrwx  1 root  bin       22 Dec 13 00:35 merge@ -> ../../../bin/cleardiff
> -r-xr-xr-x  1 root  bin    27770 Jun  1  2005 tfdmgr*
> lrwxrwxrwx  1 root  bin       23 Dec 13 00:35 xcompare@ -> ../../../bin/xcleardiff
> lrwxrwxrwx  1 root  bin       23 Dec 13 00:35 xmerge@ -> ../../../bin/xcleardiff
> 
> When I invoke my command, the type handler consults the vob database, which is nfs mounted, for
> a temporary file that it never sees... even though the file exists.
> 
> 
> Here's the invocation of mkelem -
> 
> -- atrens at atrens: /localdisk/viewstore/atrens_VxWorks-5.5.2/vobs/bcs/Tornado-2.2.x/docs/bspkit (16:02) --
> $ cleartool mkelem -nc LIB.SUB
> 
> Since this is failing fairly regularly, I stubbed in my own create_element handler to see what's going on -
> 
> $ ls -l /opt/rational/clearcase/lib/mgrs/text_file_delta/
> total 34
> lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 annotate@ -> tfdmgr
> lrwxrwxrwx  1 root  bin       22 Dec 13 00:35 compare@ -> ../../../bin/cleardiff
> lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 construct_version@ -> tfdmgr
> lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 create_branch@ -> tfdmgr
> -rwxrwxr-x  1 root  wheel   5032 Dec 13 15:59 create_element*
> lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 create_version@ -> tfdmgr
> lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 delete_branches_versions@ -> tfdmgr
> lrwxrwxrwx  1 root  bin        6 Dec 13 00:34 get_cont_info@ -> tfdmgr
> lrwxrwxrwx  1 root  bin       22 Dec 13 00:35 merge@ -> ../../../bin/cleardiff
> -r-xr-xr-x  1 root  bin    27770 Jun  1  2005 tfdmgr*
> lrwxrwxrwx  1 root  bin       23 Dec 13 00:35 xcompare@ -> ../../../bin/xcleardiff
> lrwxrwxrwx  1 root  bin       23 Dec 13 00:35 xmerge@ -> ../../../bin/xcleardiff
> 
> Here's the source code for the create_element stub -
> 
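> #include <sys/types.h>
> #include <sys/stat.h>
> #include <stdio.h>
> #include <unistd.h>   /* sync(), sleep(), execve() */
> 
> int main(int argc, char **argv, char **envp) {
>    int x;
>    struct stat sb;
> 
>    /* Dump the arguments the type manager passes to this handler. */
>    for (x = 0 ; x < argc; x++ )
>         puts(argv[x]);
> 
>    /* Wait until the temporary file becomes visible.
>     * NOTE: this is the code as it was run for the trace below. Per the
>     * correction at the top of this message, the path is argv[5]. */
>    while ( stat(argv[4], &sb) == -1 ) {
>         perror("stat failed");
>         sync();
>         sleep(1);
>    }
> 
>    /* Hand off to the real manager with the original arguments. */
>    execve("/opt/rational/clearcase/lib/mgrs/text_file_delta/tfdmgr", argv, envp);
>    perror("execve failed");   /* only reached if execve() fails */
>    return 1;
> }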
> 
> 
> Now, when I invoke it I see this -
> 
> -- atrens at atrens: /localdisk/viewstore/atrens_VxWorks-5.5.2/vobs/bcs/Tornado-2.2.x/docs/bspkit (16:16) --
> $ cleartool mkelem -nc LIB.SUB
> /opt/rational/clearcase/lib/mgrs/text_file_delta/create_element
> 45806de2
> 5241b78a.8af011db.8c99.ac:e6:df:90:6a:9f
> 5241b78e.8af011db.8c99.ac:e6:df:90:6a:9f
> 5241b792.8af011db.8c99.ac:e6:df:90:6a:9f
> /net/zcars0xx/export/vobstore/disk2/OM5K/bcs.vbs/s/sdft/1f/15/tmp_12902.1
> stat failed: No such file or directory
> stat failed: No such file or directory
> stat failed: No such file or directory
> stat failed: No such file or directory
> stat failed: No such file or directory
> ^Z
> [1]+  Stopped                 cleartool mkelem -nc LIB.SUB
> 
> -- atrens at atrens: /localdisk/viewstore/atrens_VxWorks-5.5.2/vobs/bcs/Tornado-2.2.x/docs/bspkit (16:17) --
> $ ls /net/zcars0xx/export/vobstore/disk2/OM5K/bcs.vbs/s/sdft/1f/15/tmp_12902.1
> /net/zcars0xx/export/vobstore/disk2/OM5K/bcs.vbs/s/sdft/1f/15/tmp_12902.1
> 
> -- atrens at atrens: /localdisk/viewstore/atrens_VxWorks-5.5.2/vobs/bcs/Tornado-2.2.x/docs/bspkit (16:17) --
> $ ls -l /net/zcars0xx/export/vobstore/disk2/OM5K/bcs.vbs/s/sdft/1f/15/tmp_12902.1
> -rw-rw-rw-  1 vobroot  opt_ne  0 Dec 13 16:17 /net/zcars0xx/export/vobstore/disk2/OM5K/bcs.vbs/s/sdft/1f/15/tmp_12902.1
> 
> -- atrens at atrens: /localdisk/viewstore/atrens_VxWorks-5.5.2/vobs/bcs/Tornado-2.2.x/docs/bspkit (16:17) --
> $ cp /net/zcars0xx/export/vobstore/disk2/OM5K/bcs.vbs/s/sdft/1f/15/tmp_12902.1 /tmp
> 
> -- atrens at atrens: /localdisk/viewstore/atrens_VxWorks-5.5.2/vobs/bcs/Tornado-2.2.x/docs/bspkit (16:17) --
> $ cat /tmp/tmp_12902.1
> 
> -- atrens at atrens: /localdisk/viewstore/atrens_VxWorks-5.5.2/vobs/bcs/Tornado-2.2.x/docs/bspkit (16:18) --
> $ cat /net/zcars0xx/export/vobstore/disk2/OM5K/bcs.vbs/s/sdft/1f/15/tmp_12902.1
> 
> -- atrens at atrens: /localdisk/viewstore/atrens_VxWorks-5.5.2/vobs/bcs/Tornado-2.2.x/docs/bspkit (16:18) --
> $ fg
> cleartool mkelem -nc LIB.SUB
> stat failed: No such file or directory
> stat failed: No such file or directory
> stat failed: No such file or directory
> stat failed: No such file or directory
> stat failed: No such file or directory
> stat failed: No such file or directory
> stat failed: No such file or directory
> ^C
> Interrupt
> 
> Interesting, eh?  The file it's looking for is now there but the process can't see it!  It really seems like a
> race condition, but I don't understand why the process never recovers after I've resumed it...
> 
> Heh, as interesting as this problem is, it's really starting to get annoying. :) I wonder if there's any relation
> to the infamous gmake bug?
> 
> Andrew.
> 
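
One way to tell a real client-side caching problem from a bug in the stub might
be to compare what a by-name lookup says with what a fresh directory scan says
for the same entry. The sketch below does that. The function names are
hypothetical, and it assumes nothing about NFS beyond plain stat() and
readdir().

```c
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Return 1 if name shows up in a readdir() scan of dir, 0 if it does not,
 * -1 if the directory cannot be opened. */
int name_in_listing(const char *dir, const char *name)
{
    DIR *d = opendir(dir);
    struct dirent *de;
    int found = 0;

    if (d == NULL)
        return -1;
    while ((de = readdir(d)) != NULL) {
        if (strcmp(de->d_name, name) == 0) {
            found = 1;
            break;
        }
    }
    closedir(d);
    return found;
}

/* Return 1 if stat() of dir/name succeeds, 0 if it fails with ENOENT,
 * -1 on any other error (including a path that is too long). */
int name_in_lookup(const char *dir, const char *name)
{
    char path[4096];
    struct stat sb;
    int n = snprintf(path, sizeof path, "%s/%s", dir, name);

    if (n < 0 || (size_t)n >= sizeof path) {
        errno = ENAMETOOLONG;
        return -1;
    }
    if (stat(path, &sb) == 0)
        return 1;
    return errno == ENOENT ? 0 : -1;
}
```

If name_in_listing() returns 1 while name_in_lookup() returns 0 for the same
entry, that would point at a stale lookup result on the client. If both agree,
the handler is probably just looking at the wrong path.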




