Reputation: 13682
I'm trying to parse multidimensional arrays with YACC. Here is my lvalue definition:
    lvalue: ID
              { EM_debug("got lvalue identifier " + to_String($1));
                $$.My_VAR = A_SimpleVar($$.pos, $1);
                $$.size = 0;
                $$.name = $1;
              }
          | lvalue L_SQUARE_BRACKET exp R_SQUARE_BRACKET
              { EM_debug("got lvalue[exp]");
                $$.My_VAR = A_SubscriptVar($$.pos, $1.My_VAR, $3.My_AST);
                $$.size = $3.My_AST;
                $$.name = $1.name;
              }
          ;
For the (simplified) input ia[2], it prints "got lvalue identifier ia" and then gives a parse error when it encounters the left bracket. I don't get why this would not work. It should see the left bracket in its lookahead and shift. It should not reduce immediately like this. How can I prevent it from reducing?
Upvotes: 0
Views: 364
Reputation: 241721
On the contrary, the reduction is completely correct. In order to apply
    lvalue: lvalue L_SQUARE_BRACKET exp R_SQUARE_BRACKET
to the input
    ia[2]
the parser needs to make ia into an lvalue before shifting the [ (assuming that L_SQUARE_BRACKET is a [; see below). It does this by using the rule lvalue: ID, so we can expect that rule's action to run before the [ is shifted.
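To make the ordering concrete, here is a toy shift-reduce loop in C for the two lvalue rules plus an assumed exp: NUM rule. It is only a sketch: bison generates a table-driven LALR(1) automaton, not this naive "reduce whenever the top of the stack matches a rule" loop, and the names (Sym, note, parse_trace) are made up for illustration. The order of actions for ia[2] is the same, though.

```c
#include <stddef.h>
#include <string.h>

/* Symbols of the toy grammar (exp: NUM is an assumption for illustration):
 *   lvalue: ID | lvalue '[' exp ']'
 *   exp:    NUM
 */
enum Sym { T_ID, T_LBRACK, T_NUM, T_RBRACK, T_END, N_LVALUE, N_EXP };

/* Append one action to the trace, silently truncating if the buffer is full. */
static void note(char *out, size_t cap, const char *action) {
    size_t used = strlen(out);
    if (used + 1 < cap)
        strncat(out, action, cap - used - 1);
}

/* Naive shift-reduce loop over `input` (terminated by T_END).
 * Writes the sequence of actions to `out`; returns 1 if the whole
 * input was reduced to a single lvalue, 0 otherwise. */
static int parse_trace(const enum Sym *input, char *out, size_t cap) {
    static const char *const names[] = { "ID", "[", "NUM", "]" };
    enum Sym stack[64];
    size_t top = 0, next = 0;
    if (cap == 0) return 0;
    out[0] = '\0';
    for (;;) {
        if (top >= 1 && stack[top - 1] == T_ID) {
            stack[top - 1] = N_LVALUE;               /* lvalue: ID */
            note(out, cap, "reduce lvalue:ID; ");
        } else if (top >= 1 && stack[top - 1] == T_NUM) {
            stack[top - 1] = N_EXP;                  /* exp: NUM */
            note(out, cap, "reduce exp:NUM; ");
        } else if (top >= 4 && stack[top - 4] == N_LVALUE &&
                   stack[top - 3] == T_LBRACK && stack[top - 2] == N_EXP &&
                   stack[top - 1] == T_RBRACK) {
            top -= 3;                                /* lvalue: lvalue '[' exp ']' */
            stack[top - 1] = N_LVALUE;
            note(out, cap, "reduce lvalue:lvalue[exp]; ");
        } else if (input[next] != T_END && top < 64) {
            note(out, cap, "shift ");                /* nothing to reduce: shift */
            note(out, cap, names[input[next]]);
            note(out, cap, "; ");
            stack[top++] = input[next++];
        } else {
            break;                                   /* end of input or stack full */
        }
    }
    return input[next] == T_END && top == 1 && stack[0] == N_LVALUE;
}
```

Called with the tokens ID [ NUM ] followed by T_END, the trace contains "reduce lvalue:ID" before "shift [", which is the same order the debug message in the question shows.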
So that's not the problem, and there's not enough information in the question to provide a better diagnosis. However, for what it's worth, a few notes:
1) Personally, I find it much less error-prone and easier to read to use literal characters in bison rules:
    lvalue: lvalue '[' exp ']'
which of course needs to be matched with a flex rule which returns the literal characters:
    "["|"]" { return *yytext; }
(or, using the possibly less readable syntax [][], which can be extended to a longer list of single-character tokens, such as [][(){}<>=+*/-]; just remember that ] must come first and - last in a character class).
It's entirely possible that there is a mismatch between your scanner and your parser which results in the [ not being sent with the correct token type; you certainly need to eliminate that possibility for debugging.
2) Is bison telling you about any conflicts (including shift-reduce conflicts)? Each of these needs to be tracked down and eliminated.
3) How do you know that the syntax error is being generated when the [ is seen? Have you, for example, enabled flex debugging traces (very handy for debugging) and/or bison debugging traces (which I find more useful than scattering print statements in your actions, but YMMV)?
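For what it's worth, a rough sketch of how those traces are usually switched on. The split into three pieces and the main function are assumptions about how the project is laid out; %debug, %option debug, yydebug and yy_flex_debug are the standard bison and flex names.

```c
/* 1. In the bison grammar file (declarations section):
 *        %debug
 *    This compiles the tracing code into the generated parser.
 *
 * 2. In the flex scanner file (definitions section):
 *        %option debug
 *    This compiles the tracing code into the generated scanner.
 *
 * 3. In the C driver (assumed here to be a plain main), turn both on: */
extern int yydebug;        /* defined by the bison-generated parser */
extern int yy_flex_debug;  /* defined by the flex-generated scanner */
int yyparse(void);

int main(void) {
    yydebug = 1;           /* report every shift and reduce on stderr */
    yy_flex_debug = 1;     /* report every scanner rule that matches */
    return yyparse();
}
```

With both traces on, you can see which token type the scanner actually returned for the [ and which parser state rejected it.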
Upvotes: 1
Reputation: 8097
Don't use YACC to distinguish lvalues from rvalues. Because an lvalue is almost always also a valid rvalue, encoding the distinction in the grammar tends to create reduce/reduce conflicts, which make the grammar non-deterministic.
Use a semantic analysis phase to check lvalue correctness rather than incorporating it into the YACC grammar.
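A minimal sketch of what such a check could look like in C, run over the AST after parsing. The node kinds and struct layout here are hypothetical and are not the A_SimpleVar / A_SubscriptVar types from the question; the point is only that "is this assignable?" becomes a tree walk instead of a grammar rule.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical AST node kinds, for illustration only. */
typedef enum { EXPR_VAR, EXPR_SUBSCRIPT, EXPR_INT, EXPR_ADD, EXPR_ASSIGN } ExprKind;

typedef struct Expr {
    ExprKind kind;
    struct Expr *left;   /* array (SUBSCRIPT), target (ASSIGN), or left operand (ADD) */
    struct Expr *right;  /* index (SUBSCRIPT), value (ASSIGN), or right operand (ADD) */
} Expr;

/* An expression is assignable if it is a plain variable
 * or a subscript applied to something assignable. */
static bool is_lvalue(const Expr *e) {
    if (e == NULL) return false;
    switch (e->kind) {
    case EXPR_VAR:       return true;
    case EXPR_SUBSCRIPT: return is_lvalue(e->left);
    default:             return false;
    }
}

/* Walks the whole tree and counts assignments whose target is not an lvalue. */
static int count_bad_assignments(const Expr *e) {
    if (e == NULL) return 0;
    int bad = (e->kind == EXPR_ASSIGN && !is_lvalue(e->left)) ? 1 : 0;
    return bad + count_bad_assignments(e->left) + count_bad_assignments(e->right);
}
```

With this in place the grammar can use a single expression nonterminal on both sides of an assignment, and something like (ia + 2) := 2 is rejected by the walk rather than by the parser.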
For reference though, GNU Bison resolves a reduce/reduce conflict by choosing the rule that appears first in the grammar file. So that might help you get around your problem temporarily.
Upvotes: 1