very weird linux emulation breakage ... rpc, but might be pipe/vm related

Andrew Atrens atrens at nortelnetworks.com
Wed Mar 31 07:34:15 PST 2004


All,

A kernel from March 19 is good; a kernel from March 21 shows the bug.
I am building a few more kernels now to try to pinpoint the time
of breakage.
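
The narrowing-down amounts to a bisection over source checkout times. A minimal sketch, assuming a hypothetical is_good() callback that checks out the sources as of a given timestamp, builds and boots that kernel, and returns True if the cleartool query works; the midnight boundaries and the one-hour resolution are assumptions, not something I have measured:

```python
from datetime import datetime, timedelta


def bisect_checkout_time(good, bad, is_good, resolution=timedelta(hours=1)):
    """Shrink the window [good, bad] until it is no wider than `resolution`.

    `good` is a timestamp known to give a working kernel, `bad` one known
    to show the bug. `is_good(ts)` is a hypothetical callback supplied by
    the caller (checkout as of `ts`, build, boot, test).
    """
    while bad - good > resolution:
        mid = good + (bad - good) / 2  # midpoint of the remaining window
        if is_good(mid):
            good = mid   # breakage was committed after `mid`
        else:
            bad = mid    # breakage was committed at or before `mid`
    return good, bad


# Known endpoints from this report; the time of day is assumed.
KNOWN_GOOD = datetime(2004, 3, 19)
KNOWN_BAD = datetime(2004, 3, 21)
```

With a 48-hour window and a one-hour resolution this needs six kernel builds.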

Andrew.



Andrew Atrens wrote:

> Hi All,
> 
> This is a really weird one ...
> 
> Linux cleartool (ClearCase) broke sometime last week ... I just
> cvsupped the latest sources and rebuilt, and the breakage appears to
> still be there. I wish I could be more exact as to the day of the break,
> but I think I first noticed it last Thursday (at the time I was too
> busy to investigate and chalked it up to some maintenance work
> occurring on our local network ...)
> 
> After a fair bit of investigation I've characterised it as follows -
> 
> Clearcase commands that involve querying a non-DragonFly vob server
> all work, as do commands that query viewservers on non-DragonFly nodes.
> HOWEVER commands that query my (DragonFly) viewserver hang, and
> eventually report rpc timeout.
> 
> It gets even weirder. On the surface, it looks as though I can run
> DragonFly viewserver queries as root (from any node!!!), but not
> as non-root - with one proviso.
> Sometimes if I run the query as root a few times it seems as though
> something gets cached, because at that point I can repeat the query
> as non-root and have it work...
> 
> I didn't understand how the DragonFly view server could be sensitive to
> the cleartool client being non-root, especially when the client was on
> another box !!! Then when I looked at the creds list (with ethereal) I
> found that root had more auxiliary group creds than my userid. So I added
> myself to all the groups in my /etc/group file, and amazingly found that
> I was able to do much larger queries... that's when I discovered that
> root was fubared for huge queries too. Yeesh.
> 
> I've attached linux_kdumps of the below (local) viewserver queries -
> atrens.txt contains the client trace, view_server-atrens.txt contains
> a trace of the view server during that query attempt - there's a similar
> pair there for root, too.
> 
> Any help/suggestions you folks could provide would be greatly appreciated.
> 
> Andrew.
> 
> 
> -- atrens at atrens:
> /usr/opt/viewstore/atrens_APP_COMMON/vobs/equinox_ne_foundation/cpumon
> (15:35) --
> $ ktrace -f ../atrens -id cleartool ls
> Makefile@@/main/ottawa_main/1                            Rule:
> BASE_COMMON_FN81_20040220_0146.6829 -nocheckout ^C
> Interrupt
> 
> -- atrens at atrens:
> /usr/opt/viewstore/atrens_APP_COMMON/vobs/equinox_ne_foundation/cpumon
> (15:35) --
> $ su
> Password:
> 
> -- root at atrens:
> /usr/opt/viewstore/atrens_APP_COMMON/vobs/equinox_ne_foundation/cpumon
> (15:35) --
> # ktrace -f ../root -id cleartool ls
> Makefile@@/main/ottawa_main/1                            Rule:
> BASE_COMMON_FN81_20040220_0146.6829 -nocheckout
> cpumon.bun@@/main/ottawa_main/3                          Rule:
> BASE_COMMON_FN81_20040220_0146.6829 -nocheckout
> includes@@/main/ottawa_main/2                            Rule:
> BASE_COMMON_FN81_20040220_0146.6829 -nocheckout
> objects@@/main/ottawa_main/1                             Rule:
> BASE_COMMON_FN81_20040220_0146.6829 -nocheckout
> sources@@/main/ottawa_main/2                             Rule:
> BASE_COMMON_FN81_20040220_0146.6829 -nocheckout
> 
> -- root at atrens:
> /usr/opt/viewstore/atrens_APP_COMMON/vobs/equinox_ne_foundation/cpumon
> (15:35) --
> # exit
> exit
> 
> -- atrens at atrens:
> /usr/opt/viewstore/atrens_APP_COMMON/vobs/equinox_ne_foundation/cpumon
> (15:35) --
> $ ktrace -f ../atrens -id cleartool ls
> Makefile@@/main/ottawa_main/1                            Rule:
> BASE_COMMON_FN81_20040220_0146.6829 -nocheckout ^C
> Interrupt





