Reputation: 1751
In Compiler Construction by Aho, Ullman and Sethi, it is stated that the scanner (lexical analysis) reads the input string of source characters and groups the characters into meaningful sequences called lexemes, and that for each lexeme the scanner produces as output a token of the form shown below:
<token-name, attribute-value>
e.g. position = initial + rate * 60
The characters are grouped into lexemes and mapped into tokens.
My question is: how are these tokens stored in the symbol table? We are only mapping lexemes to tokens such as <id, 1>, <id, 2>, etc., so where do we store the values corresponding to these tokens in the symbol table? I am aware of the symbol table, but can somebody please tell me the signature of the ST used here? Is it something like <id, map<token-name, attribute-value>>?
Also, for all id fields (identifiers), which data structure is used to store the information related to an identifier, such as name, scope, size, and data type?
And at which stage is the ST generated? All stages of a compiler (scanner, parser, semantic analyzer, etc.) use the ST for reference.
Another question: when the parser asks for the next input token, does the scanner read that token from the ST or from the input data? Or does attribute-value simply contain a pointer into the symbol table? Please help me understand.
Upvotes: 2
Views: 769
Reputation: 241671
During the lexical scan, the only information you have about a symbol is its spelling. So you can't do much more than intern the symbol to avoid multiple dynamic allocations of the symbol's name. (How useful this is depends a lot on your implementation language.)
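To make the interning idea concrete, here is a minimal sketch in Python. Everything in it (`Token`, `InternTable`, `scan`, the tiny token grammar) is a hypothetical illustration, not something taken from the book or from any particular compiler. The scanner reads from the input text, and the attribute of an `id` token is just an index into a table of spellings. Indices start at 0 here, whereas the question's example numbers identifiers from 1.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    name: str          # token-name, e.g. "id", "number", "+"
    attribute: object  # attribute-value: intern index, numeric value, or None


class InternTable:
    """Stores each distinct spelling once and hands out a small integer for it."""

    def __init__(self):
        self._index = {}     # spelling -> index
        self.spellings = []  # index -> spelling

    def intern(self, spelling):
        idx = self._index.get(spelling)
        if idx is None:
            idx = len(self.spellings)
            self._index[spelling] = idx
            self.spellings.append(spelling)
        return idx


# Hypothetical toy grammar: identifiers, unsigned integers, and the operators = + *
TOKEN_RE = re.compile(r"\s*(?:(?P<id>[A-Za-z_]\w*)|(?P<number>\d+)|(?P<op>[=+*]))")


def scan(source, table):
    """Yield tokens from the input text; only the spelling is recorded for ids."""
    source = source.rstrip()
    pos = 0
    while pos < len(source):
        m = TOKEN_RE.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        pos = m.end()
        kind = m.lastgroup  # name of the alternative that matched
        text = m.group(kind)
        if kind == "id":
            yield Token("id", table.intern(text))
        elif kind == "number":
            yield Token("number", int(text))
        else:
            yield Token(text, None)


table = InternTable()
tokens = list(scan("position = initial + rate * 60", table))
```

At this point the table knows nothing about type, scope, or size. It only guarantees that two occurrences of the same name get the same index, so later phases can attach information to that index (or to whatever richer structures they build).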
As the analysis continues, you will accumulate more information about each symbol. In most programming languages, the same name will be associated with multiple objects: some of the associations will be scoped (local variables) while others will be contextual (namespaces and aggregate members, for example). The precise meaning of each lexeme will need to be resolved, but that might not happen even during the initial syntactic parse. (For example, the name of a structure member will need to be associated with the actual member in the object which describes the structure's type, but until you've resolved the type of each expression, you won't know what the structure type is.)
So there is no one answer to this question. There will likely be a lot of different containers in your compiler which associate a name with some collection of information, and they are not likely to all have the same data fields. All that will have to be fleshed out as you write the various phases of your compiler.
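As a rough illustration of "different containers with different data fields", the sketch below keeps two unrelated name-to-information maps: a chain of scopes for variables and a per-type member table for structures. All names in it (`VarInfo`, `StructType`, `Scope`, `member_type`) are assumptions made for the example, and a real compiler would carry far more information in each record.

```python
from dataclasses import dataclass, field


@dataclass
class VarInfo:
    name: str
    type_name: str
    depth: int  # nesting depth of the scope that declared the variable


@dataclass
class StructType:
    name: str
    members: dict = field(default_factory=dict)  # member name -> type name


class Scope:
    """One level of scoped names; lookups fall back to the enclosing scope."""

    def __init__(self, parent=None):
        self.parent = parent
        self.depth = 0 if parent is None else parent.depth + 1
        self.symbols = {}

    def declare(self, name, type_name):
        if name in self.symbols:
            raise NameError(f"{name!r} already declared in this scope")
        info = VarInfo(name, type_name, self.depth)
        self.symbols[name] = info
        return info

    def lookup(self, name):
        scope = self
        while scope is not None:
            if name in scope.symbols:
                return scope.symbols[name]
            scope = scope.parent
        return None


def member_type(scope, struct_types, var_name, member):
    """Resolve `var.member`: the variable's type must be known before the member is."""
    var = scope.lookup(var_name)
    if var is None:
        raise NameError(f"undeclared variable {var_name!r}")
    return struct_types[var.type_name].members[member]


struct_types = {"Point": StructType("Point", {"x": "int", "y": "int"})}
outer = Scope()
outer.declare("rate", "float")
outer.declare("p", "Point")
inner = Scope(outer)
inner.declare("rate", "int")  # shadows the outer declaration
```

Here the same spelling `rate` maps to two different records depending on the scope, and `x` only acquires a meaning once the type of `p` has been resolved, which is why that resolution cannot happen in the scanner.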
Upvotes: 4