Reputation: 493
I am using Lex/Bison to create a simple scripting language which translates into C. A trans-compiler.
I know Bison can verify syntax, but what about scoping? Can it verify the an identifier that is being used was declared before? Does that have to be done manually? And if so, at which step?
E.g. This is syntax-wise correct. But is should not compile because message is in the wrong scope. In addition, message2 was never declared.
String s1
if (s1=='knock on door')
String message1='Hello';
print message1;
print message2;
Upvotes: 2
Views: 1823
Reputation: 590
One way to handle it in Bison without going through another pass after you obtain AST (abstract syntax tree) is,
pop_scope()
and push_scope()
to a new scope, you switch to a new sibling node of scope with dead scopes information still saved.if
statement), use Bison midrule action feature to invoke push_scope()
at "entrance token" (if
statement, {
block begin, etc.). And invoke pop_scope()
at the main action body.Put in a concrete way, in your language, your original rule may look like:
selection-statement: _IF '(' expression ')' statement {
/* main action */
}
With midrule action, you can insert an action for _IF, and pop, push scopes at corresponding spot directly in Bison, and it becomes:
selection-statement: _IF { push_scope(); } '(' expression ')' statement {
/* main action */
pop_scope();
}
So in your example, at the time symbol message1
is defined, push_scope()
is already being called at the point where parser encounters the _IF
token when Bison is reducing this rule. And then when you try to read message1
out side of its scope, your symbol table interface should return some error and you should be able to throw error message at Bison using yyerror()
.
P.S. When you insert a midrule action, you should shift the reference numbers (e.g., $0
, $1
) for expressions. For a complex rule with midrule actions, this soon gets really messy and hard to read. A useful trick is using Bison named references to avoid mistakenly using wrong reference names.
Upvotes: 0
Reputation: 95392
Parsers (however produced) typically cannot check scoping rules.
To do scope checking in general, in general you have to build the entire AST and then verify the identifiers are well-scoped. For languages which have static scoping with scopes introduced before variable use, you might be able to do this only the fly in rules. For languages with namespaces, the namespace declaration might occur anywhere, and you can't do it without collecting that namespace and thus the entire program. (The namespace declaration might not even be in the same file as the use; now you have parse an entirely different file).
As a consequence, the usual way scope checking is done is after parsing, over the program tree. Normally if you are going to do anything beyond parsing, you need the tree, and the full symbol table anyway, so this isn't really much of an annoyance.
Some early compilers didn't have the luxury of holding the entire tree in memory. Either they had languages with scope declarations before use, or they were implemented in multiple passes.
Upvotes: 2
Reputation: 493
Found the answer. There is actually a third phase called Semantic Analysis. The steps:
Semantic Analysis analyzes the parse tree that is created in Step 2.
Upvotes: 1
Reputation: 42825
Bison does not handle identifier scoping and lookup. That functionality is up to you to provide in rule actions.
Upvotes: 0