What are the different token types in C++ compilation?

Question

Walter Bright's article on C++ compilation uses these two phrases:

"Conversion to preprocessing tokens."
What is the initial token? What does a preprocessing token look like?

"Conversion of preprocessing tokens to C++ tokens."
What is a C++ token, and why isn't the source converted into C++ tokens in the first place?

Reference: http://www.drdobbs.com/blogs/cpp/228701711

Oliver Charlesworth · Accepted Answer

A preprocessing token is an element of the grammar of the preprocessor. From [lex.pptoken] in the C++ standard:

preprocessing-token:
    header-name
    identifier
    pp-number
    character-literal
    user-defined-character-literal
    string-literal
    user-defined-string-literal
    preprocessing-op-or-punc
    each non-white-space character that cannot be one of the above

...

A preprocessing token is the minimal lexical element of the language in translation phases 3 through 6.

So the "conversion to preprocessing tokens" is the process of lexing the translation unit and identifying individual tokens.
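As a small illustration of how the preprocessing-token grammar is looser than the grammar of the language proper, here is a minimal sketch. The names STR and odd_number_spelling are made up for this example. It relies on two facts: the # operator stringizes the preprocessing tokens of its argument, and 1.2.3 is a single valid pp-number even though it could never become a valid literal.

```cpp
#include <cassert>
#include <string>

// STR is a hypothetical helper macro: the preprocessor's # operator turns
// the preprocessing tokens of its argument into a string literal.
#define STR(x) #x

// 1.2.3 is not a valid C++ literal, but it matches the pp-number grammar,
// so the lexer accepts it as one preprocessing token. Because it is
// stringized during preprocessing, it is never converted to a C++ token
// and the program compiles.
inline std::string odd_number_spelling() {
    return STR(1.2.3);
}

// The argument below lexes as three preprocessing tokens: a, + and b.
// Stringizing collapses each run of white space between them to one space.
inline std::string spaced_expression() {
    return STR(a   +   b);
}
```

Calling odd_number_spelling() should give back the text 1.2.3, and spaced_expression() the text a + b, which shows the token boundaries the preprocessor found.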

C++ tokens (really just "tokens") are listed in [lex.token]:

token:
    identifier
    keyword
    literal
    operator
    punctuator

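These exist only from translation phase 7 onward: once preprocessing (directive handling, macro expansion and so on) is complete, each preprocessing token is converted into a token. The conversion cannot happen earlier because the preprocessor works on a looser grammar (a pp-number, for example, covers spellings that are not valid literals, and keywords are just identifiers at that stage), and macro expansion can still change the token sequence.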
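To see why the conversion to tokens has to wait, consider this minimal sketch. CAT and the two function names are made up for the example. The ## operator pastes two preprocessing tokens into one during macro expansion, so the final token only comes into being partway through preprocessing.

```cpp
#include <cassert>

// CAT is a hypothetical helper macro: ## pastes two preprocessing tokens
// into a single preprocessing token during macro expansion.
#define CAT(a, b) a##b

inline int pasted_keyword_demo() {
    // "in" and "t" are two identifier preprocessing tokens. Pasting them
    // yields the single preprocessing token "int", which is classified as
    // a keyword only when preprocessing tokens are converted to tokens.
    CAT(in, t) value = 42;  // after preprocessing: int value = 42;
    return value;
}

inline int pasted_number_demo() {
    // Two pp-numbers, 1 and 2, are pasted into the pp-number 12, which
    // then becomes an integer literal.
    return CAT(1, 2);
}
```

Here pasted_keyword_demo() returns 42 and pasted_number_demo() returns 12. Had the lexer committed to keywords and literals before macro expansion, neither result could have been formed from these pieces.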

For more information on the entire process, I suggest reading [lex.phases] in the C++ standard.
