Reputation: 151
I'm writing a compiler for the C- language and I only have one more problem to solve: how to handle the dangling else. The original rule looks like:
A --> if (expression) statement | if (expression) statement else statement
After getting rid of the left recursion:
A --> if(expression) statement B
B --> else statement | EMPTY
The problem is that "else" is in the first and follow sets of B. I think this makes sense for an example like this:
if(x>y)
if(x == 10)
printf("x is 10.\n");
else
printf("x<y");
The first if is followed by an else and the second if is followed by the same else, so there is ambiguity in how the rule was applied. I get that I need to pair the else to the closest, open if, but I'm not sure how that translates into code for the parser. When I hit rule A I'll call B, but then what? If I see "else" as the next token, do I use B --> else statement or B --> EMPTY?
Upvotes: 3
Views: 981
Reputation: 182743
The parser is greedy. That is, a statement does not end until it must end. If something can be parsed as a continuation of the current, innermost statement, it is. So the inner if
does not end when the else
is encountered because it can include the following else
, therefore it does.
When the parser encounters the else
, it has two choices -- end the inner statement or don't end the inner statement. Such choices are always resolved in favor of making the statement as big as possible. Hence the term "greedy".
Upvotes: 6