Reputation: 6498
I came across this answer for writing a top-down parser: Is there an alternative for flex/bison that is usable on 8-bit embedded systems? but I'm confused on a few points.
Say I have this grammar:
Expr = Id | Id '(' Expr ')'
Stmt = Id '=' Expr | Expr
I'm not sure if it's strictly necessary but I guess we can left-factor the grammar:
Expr = Id ExprRest
ExprRest = ϵ | '(' Expr ')'
Stmt = Id '=' Expr ';' | Expr ';'
How exactly would I write code that properly parses foo(x);
as a Stmt
? If we write the Stmt
parsing code like:
func ParseStmt() {
if ParseId() {
return ParseTerminal('=') && ParseExpr() && ParseTerminal(';');
} else {
return ParseExpr() && ParseTerminal(';');
}
}
We will see the Id
foo
, assume we are in the first case, and then fail because we fail to find a =
following foo
.
Is this grammar not LL(1)?
Upvotes: 1
Views: 283
Reputation: 95430
Your problem is that ID(xxx) can appear in both statement and expression contexts.
As the other answer says, you have to factor more. It isn't that hard. Write your grammar like this:
Stmt = Id ( '=' Expr ';' |
RestOfExpression );
RestOfExpression = '(' Expr ')' |
ϵ );
Expr = Id RestOfExpression;
The code should be obvious.
Upvotes: 0
Reputation: 241971
That grammar is not LL(1)
, because an LL(1)
grammar needs to be able to determine which production to expand given only the parsing stack and the look-ahead token. With an empty parsing stack, you can't tell which of the two Stmt
productions to use given the Id
lookahead token. You could left-factor it, but it gets annoying.
The executables created by bison
, although they occupy a lot of lines of C, are actually pretty compact. Your grammar, compiled by bison, produced 1559 lines of C, but compiled into an object file of 4K, of which (I believe) only slightly more than 1K corresponds to actual code and data. (Of course, that's without actions.)
If you really want to hand-code a parser for an expression grammar, I'd suggest you use either the bottom-up 'Shunting Yard' algorithm or a top-down 'Pratt parser' (both of which are readily searchable with Google), which handle things like operator precedence better than a strict LL grammar. But you might well find that the bison-generated grammar is of comparable size and speed, and better readability and maintainability. (Or you might not. Tastes differ.)
Upvotes: 1