Reputation: 135
I am reading about C preprocessor (cpp) macro expansion and want to understand how a macro expands when the (optional) token-string is not provided. With gcc v4.8.4 I see this:
$ cat zz.c
#define B
(B)
|B|
$ gcc -E zz.c
# 1 "zz.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "zz.c"
()
| |
Can anyone explain why the expansion leaves zero spaces in one case and one space in the other?
Upvotes: 7
Views: 357
Reputation: 41
edit: see hvd's answer about gcc's preprocessor implementation
This may be to differentiate between the bitwise and logical OR operators.
This sample:
if (x | 4) printf("true\n"); // Bitwise OR: yields x with bit 2 set (nonzero for any x)
is different from:
if (x || 4) printf("true\n"); // Logical OR: yields exactly 1, always true
Since they are different operators with different functions, it is necessary for the preprocessor to add whitespace to avoid changing the intended meaning of the statement.
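To make the difference between the two operators concrete, here is a minimal sketch. The function names bitwise_or, logical_or and show_or_difference are made up for illustration and are not part of the question or of any library.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helpers, named only for this illustration. */

/* Bitwise OR combines the bits of both operands. */
int bitwise_or(int a, int b) { return a | b; }

/* Logical OR yields 1 if either operand is nonzero, otherwise 0. */
int logical_or(int a, int b) { return a || b; }

/* Prints both results for the same operands and checks that they differ. */
void show_or_difference(void) {
    int bits = bitwise_or(1, 4);   /* 001 | 100 = 101 */
    int truth = logical_or(1, 4);  /* both nonzero, so 1 */
    printf("1 | 4 = %d, 1 || 4 = %d\n", bits, truth);
    assert(bits != truth);
}
```

Calling show_or_difference() from main shows that the same operands give different values depending on whether the source contains one | token or a single || token.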
Upvotes: 1
Reputation: 121387
The C preprocessor operates on tokens, and wherever printing two tokens next to each other could change their meaning or make them ambiguous, it adds whitespace between them to preserve the meaning.
Consider your example,
(B)
Whether or not a space is added between ( and ) does not change the meaning and introduces no ambiguity, regardless of what B expands to.
But that's not the case with
|B|
Depending on the definition of B, this could expand to either || or |something|. So the preprocessor has to add whitespace to keep the two | tokens from being re-read as a single || token under C's lexical rules.
The same behaviour can be seen with any other token that could alter the meaning. For example,
#define B +
B+
would produce
+ +
as opposed to
++
for the said reason.
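As a rough check that the compiler really sees two separate + tokens here, the sketch below uses a macro named PLUS in the role of B and a made-up function two_unary_pluses. If the tokens were merged into ++, the variable would be incremented; with two unary pluses it is left alone.

```c
#include <assert.h>
#include <stdio.h>

#define PLUS +   /* hypothetical name; plays the role of B in the example above */

/* PLUS+x is the token sequence + + x (two unary pluses), not ++x. */
int two_unary_pluses(int x) {
    int before = x;
    int y = PLUS+x;        /* applies unary plus twice, does not modify x */
    assert(x == before);   /* x would have changed if this had been ++x */
    printf("x = %d, PLUS+x = %d\n", x, y);
    return y;
}
```

With ++x the function would return its argument plus one; here it returns the argument unchanged, which is the meaning the extra space in the gcc -E output preserves.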
However, this applies only to the preprocessor that follows the C lexical rules. GCC also supports an older mode, the traditional preprocessor, which does not add any extra whitespace. For example, if you invoke the preprocessor in traditional mode:
gcc -E -traditional-cpp file.c
then
#define B
(B)
|B|
produces (without the whitespace)
()
||
Upvotes: 7
Reputation:
The output of gcc -E intentionally does not match the exact rules specified by the C standard. The C standard does not describe any particular way in which the preprocessor result should be made visible, and does not even require that such a way exist.
The only time some sort of preprocessor output is required to be visible is when the # operator is used. And if you use this, you can see that there isn't any space.
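A minimal sketch of that check follows. The macro names EMPTY, STR and XSTR and the two functions are chosen here for illustration; the two-level macro is the usual idiom for expanding an argument before stringifying it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define EMPTY            /* object-like macro with no replacement tokens */
#define STR(x)  #x       /* stringify the argument */
#define XSTR(x) STR(x)   /* expand the argument first, then stringify it */

/* Returns the spelling of |EMPTY| after macro expansion. */
const char *expanded_pipes(void) {
    return XSTR(|EMPTY|);
}

/* Prints the stringified result and checks that no space was inserted. */
void show_expanded_pipes(void) {
    const char *s = expanded_pipes();
    printf("[%s] has length %zu\n", s, strlen(s));
    assert(strcmp(s, "||") == 0);
}
```

The string contains the two | characters with nothing between them, even though the compiler still treats them as two separate tokens. The space in the gcc -E output is therefore a property of how the tokens are printed, not of the expansion itself.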
flaming.toaster's answer rightly points out that the reason the gcc -E output inserts a space is to prevent the two consecutive | tokens from being parsed as a single || token. The following program is required to give a diagnostic for the syntax error:
#define EMPTY
int main() { return 0 |EMPTY| 0; }
and the space is there to make sure the compiler still has enough information to actually generate the error.
Upvotes: 4