Reputation: 33395
In early stages of preprocessing C, newlines (unlike other kinds of whitespace outside quotes) are retained; by the time actual parsing begins, they're gone. When exactly are they removed?
5.1.1.2 Translation phases says "7. White-space characters separating tokens are no longer significant" but that's after "6. Adjacent string literal tokens are concatenated" which doesn't seem right, because string literals on separate lines are still concatenated. What am I missing?
6.10.3.2 The # operator says "Each occurrence of white space between the argument’s preprocessing tokens becomes a single space character in the character string literal." Is that an earlier removal of newlines, separate from their removal from the entire file?
Upvotes: 2
Views: 177
Reputation: 78923
You are right that the text is somewhat ambiguous. Newlines are clearly significant up to phase 4; otherwise the preprocessing directives couldn't be executed correctly. What makes string literal tokens "adjacent" is never explained, particularly since white space loses its significance only in phase 7.
My understanding is that "adjacent tokens" are tokens separated only by white space (if any); white space by itself is not considered to form tokens. With that reading it becomes clear that newlines between string literal tokens are effectively removed by phase 6.
Upvotes: 3