MattW

Reputation: 135

cpp expansion of macro with no token-string

I am reading about C preprocessor (cpp) macro expansion and want to understand what happens when the (optional) token-string is omitted. With gcc v4.8.4 I get this:

$ cat zz.c
#define B
(B)
|B|
$ gcc -E zz.c
# 1 "zz.c"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "zz.c"

()
| |

Can anyone explain why the expansion produces no space in one case and one space in the other?

Upvotes: 7

Views: 357

Answers (3)

flaming.toaster

Reputation: 41

edit: see hvd's answer about gcc's preprocessor implementation

This may be to differentiate between the bitwise and logical OR operators.

This sample:

if (x | 4) printf("true\n"); // Bitwise OR: yields x with bit 2 set (nonzero, but not necessarily 1)

Is different from:

if (x || 4) printf("true\n"); // Always true

Since they are different operators with different functions, it is necessary for the preprocessor to add whitespace to avoid changing the intended meaning of the statement.

Upvotes: 1

P.P

Reputation: 121387

The C preprocessor operates on tokens, not on raw text. When GCC prints the preprocessed output, it adds whitespace wherever two adjacent tokens could otherwise be read back as a different token, so that the meaning is preserved.

Consider your example,

(B)

it makes no difference whether a space is added between ( and ): the result is unambiguous and means the same thing regardless of what B expands to.

But it's not the case with

|B|

Depending on how B is defined, the above could become either || or |something|. So the preprocessor has to add whitespace to keep the two | tokens separate under C's lexical rules.

The same behaviour can be seen with any other token that could alter the meaning. For example,

#define B +
B+

would produce

+ +

as opposed to

++

for the said reason.

However, this applies only to the standard preprocessor, which follows C's lexical rules. GCC also supports an older "traditional" preprocessor, which does not add any extra whitespace. For example, if you invoke the preprocessor in traditional mode:

gcc -E -traditional-cpp file.c

then

#define B 

(B)
|B|

produces (without the whitespace)

()
||

Upvotes: 7

user743382


The output of gcc -E intentionally does not match the exact rules specified by the C standard. The C standard does not describe any particular way in which the preprocessor result should be made visible, and does not even require that such a way exist.

The only time some sort of preprocessor output is required to be visible is when the # operator is used. And if you use this, you can see that there isn't any space.

flaming.toaster's answer rightly points out that the reason the gcc -E output inserts a space is to prevent the two consecutive |s from being parsed as a single || token. The following program is required to give a diagnostic for the syntax error:

#define EMPTY
int main() { return 0 |EMPTY| 0; }

and the space is there to make sure the compiler still has enough information to actually generate the error.

Upvotes: 4
