von spotz

Reputation: 905

Grammar, Stack, Terminal-Symbols and Tokens

A parser (or compiler) usually includes a tokenizer, which recognizes certain token types in the input, a sequence of symbols drawn from the alphabet.

As a result, the parser itself only reads a stream of tokens, not raw characters.

At the level of the grammar, however, one speaks of terminals and nonterminals, not of tokens.

This means that both the grammar and the parser's stack (let's assume we are using an LL(k) or LR-family parser) consist of terminals and nonterminals. Let's also assume that it makes sense to use the token types as the terminal symbols when speaking of symbols from a grammatical point of view.

Is there a convention for how to represent grammar symbols (terminals especially), and are terminals normally "typed"?

My guess would be multiple inheritance of interfaces. Yet there is only one class, TokenType, so an interface for it alone wouldn't really make sense. With IGrammarSymbol, there are at least two classes, Terminal and Nonterminal, which can implement that interface.
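To make the design in question concrete, here is a minimal Python sketch, assuming a terminal is identified purely by its token type. `GrammarSymbol` stands in for the IGrammarSymbol interface; the token types, the `matches` helper, and the `Expr` nonterminal are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto


class TokenType(Enum):
    # Hypothetical token types produced by the tokenizer.
    NUMBER = auto()
    PLUS = auto()
    EOF = auto()


class GrammarSymbol:
    """Common base playing the role of IGrammarSymbol."""


@dataclass(frozen=True)
class Terminal(GrammarSymbol):
    # Assumption: a terminal is fully identified by its token type.
    token_type: TokenType


@dataclass(frozen=True)
class Nonterminal(GrammarSymbol):
    name: str


def matches(symbol, lookahead):
    """True if `symbol` is a terminal whose token type equals the lookahead's."""
    return isinstance(symbol, Terminal) and symbol.token_type is lookahead


# An LL-style parse stack holds both kinds of symbols side by side.
EXPR = Nonterminal("Expr")
stack = [Terminal(TokenType.EOF), EXPR, Terminal(TokenType.NUMBER)]
```

With this layout, the parser can compare the symbol on top of the stack against the type of the next token, while nonterminals are handled by the parse table instead.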

Sincerely yours

Upvotes: 3

Views: 630

Answers (1)

rici

Reputation: 241791

Every grammar symbol, terminal or non-terminal, is logically a (different) class in the parser. There is no inherent similarity between the different "terminal" classes or the different "non-terminal" classes, and grouping these classes into an inheritance structure derived from "terminal" and "non-terminal" base classes makes little sense.

Each concrete instance of a grammar symbol is associated with zero or more attributes. The semantics and types of the attributes are determined by the symbol's type. It can make sense to derive two different grammar symbol classes which happen to have the same (or similar) attribute collections from the same base type. For example, different kinds of expression-operator non-terminals may all be derived from some base Expression class. Value-carrying terminals (identifiers and literal constants) might also be derived from this class, but purely syntactic terminals are more likely to be derived from a class which has no semantic attributes. (It might still have syntactic attributes which correlate with the source code location. It might at first glance seem that these syntactic attributes only apply to terminals, but it is often convenient for non-terminals to also have syntactic attributes.)
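As a rough illustration of grouping symbols by their attributes rather than by "terminal" versus "non-terminal", here is a minimal Python sketch. All class names, the `value_type` attribute, and the `make_add` helper are assumptions made for the example, not part of any particular parser generator.

```python
from dataclasses import dataclass


@dataclass
class SourceSpan:
    # Syntactic attribute: character offsets in the source text.
    start: int
    end: int


@dataclass
class Symbol:
    # Every symbol, terminal or non-terminal, can carry a location.
    span: SourceSpan


@dataclass
class Expression(Symbol):
    # Semantic attribute shared by everything that yields a value.
    value_type: str


@dataclass
class IntLiteral(Expression):
    # Value-carrying terminal, derived from Expression.
    value: int


@dataclass
class Punctuation(Symbol):
    # Purely syntactic terminal: location only, no semantic attributes.
    text: str


@dataclass
class Add(Expression):
    # Operator non-terminal, also derived from Expression.
    left: Expression
    right: Expression


def make_add(left, plus, right):
    """Reduction action for a hypothetical rule Add -> Expression '+' Expression."""
    # The non-terminal's span covers its children; `plus` adds nothing semantic.
    span = SourceSpan(left.span.start, right.span.end)
    return Add(span, left.value_type, left, right)
```

Here `IntLiteral` (a terminal) and `Add` (a non-terminal) share a base because they share attributes, while `Punctuation` sits elsewhere in the hierarchy even though it is also a terminal.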

Upvotes: 2
