Pretty please: no more lower-case macros !!!
Adrian Bocaniciu
a.bocaniciu at computer.org
Thu Jan 6 01:00:50 PST 2005
As I have already mentioned in a follow-up to my recent bug report, the
essential ports net/quagga and net/zebra have been totally broken by the
use of lower-case macros in the DragonFly system headers.
There is an old C tradition that the identifiers used in macro
definitions must be written entirely with capital letters and that no
other identifiers should be written like that. This tradition has not
been caused by esthetic reasons but by the need to separate the name
spaces for macros and for other identifiers.
The way in which preprocessing is implemented in C dictates that this
name space separation is mandatory, otherwise it is not possible to
guarantee the correctness of any C program. C is neither Lisp nor
Scheme nor a language that uses a special marker, e.g. $, for macro
expansions, where the name space overlap does not cause any troubles.
When you have more than 10,000 ports, written by more than 100,000
programmers, one can be certain that almost every possible alphanumeric
string, which is not too long, has already been used by someone,
somewhere, as the name of a local variable. Therefore, it is truly
naive to believe that if a lower-case macro definition is inserted in a
system header used by all these ports, then no name collision will happen.
The following pathological example, which is suitable for a C textbook,
is not made up, it is extracted from the current DragonFly sources :-)
Let's say that programmer Unlucky, which contributes code to the port
#13013, writes the following code fragment:
. ..
#include <some_system_header.h>
. ..
{
. ..
struct s1 var1;
struct s2 var2;
struct s3 var3;
. ..
memset(& var2, MAGIC_VALUE, sizeof var2);
. ..
}
. ..
After testing his program, Unlucky declares proudly that his code is
foolproof. A thousand auditors read his source code and they also
conclude that the code is foolproof.
Then comes DragonFly and the following macro definition is inserted in
some_system_header.h:
#define var2 foo[0]
When the port is recompiled, the compiler will allocate for var2 an
array of null length, whose address will be thus the same with that of
one of the adjacent variables. However, in the sizeof context, foo[0]
will be an array component of type struct s2, therefore the compiler
will translate the memset invocation as it would have been:
memset(& var1, MAGIC_VALUE, sizeof(struct s2));
Thus, unexpectedly, var1 will be destroyed, resulting in more or less
spectacular crashes. Moreover, if struct s2 is large enough, the memset
invocation will also destroy other parts of the stack, including the
return address, leading to even more spectacular crashes. Even more
perverted examples can be easily constructed.
One solution to avoid the name collisions would be to precede each
program block with "#undef" directives for all its local identifiers.
It is obvious that such a stupid solution is not acceptable. The
traditional convention about the use of capital letters to enforce the
name space separation is much more convenient.
In the case of net/route.h, where I first encountered such macros, they
were used to provide alternate names for some structure members. That
can be done in a perfectly safe way without macros, by using anonymous
unions. Even if, for some very stupid reason (as the named unions were
really a mistake in the original C design) the anonymous unions have not
made their way yet into the official C standard, gcc and most other C
compilers support anonymous unions also in C, not only in C++.
Best regards !
More information about the Bugs
mailing list