Reputation: 4873
I am struggling with a grammar that involves typed expressions as well as variable access. The result type of this access is not ascertainable during parsing and is evaluated in a second step. This evaluation is not a problem, but it seems hard to write unambiguous parser rules.
All operations that work on different types (e.g. compare operators) produce a reduce/reduce
conflict. Clearly, this is because the parser can not decide if "x.a = y.b
" should be parsed as "bool_expr EUQAL bool_expr
" or as "num_expr EUQAL num_expr
" because the type is uncertain. However, the result type of the comp_op
rule is certain (as it is always a boolean value).
Is there any solution to this problem without throwing all type information away during parsing and always check it in the evaluation phase?
Here is a shortened grammar example (using ocamllex and ocamlyacc):
comp_op:
| bool_expr EQUAL bool_expr { T.Equiv (T.Wrapper $1, T.Wrapper $3) }
| num_expr EQUAL num_expr { T.Equiv (T.Wrapper $1, T.Wrapper $3) }
/* the wrapper type basically just wraps the expressions to get a common type */
bool_expr:
| TRUE { T.Bool true }
| FALSE { T.Bool false }
| access { T.BoolAccess $1 }
num_expr:
| NUMBER { T.Num $1 }
| access { T.NumAccess $1 }
access:
/* some more complex rules describing the access to variables */
Thanks for your help.
Upvotes: 2
Views: 406
Reputation: 1032
As already said, you will have a hard time writing a "type-correct parser" -- depending on your language this might even be impossible.
Anyway, the problem here is, that your grammar does not know the type of the "access" production; as far as I understood, this production resembles reading from variables, the type of which is unknown at parse-time. The way I see it, you either abandon the 100% type-correct parsing OR you find a way of "magically" knowing the type of your variables. You could keep track of type declarations and let the lexer look up the type of a variable it encounters; the lexer then would send a variable-identifier-token based on the type of the variable.
I'm not sure if this approach works as I don't know what your language looks like.
Upvotes: 0
Reputation: 31459
As ygrek says, you should not try to mix parsing and typing. It's much easier to write your parser with only one syntactic category for expressions, and then to have a separate type-checking pass that will sort that out.
Theoretically, this comes from the fact that the distinctions made by typing rules are much finer than what traditional parsing technologies can express. They have been attempt at specifying typing rules more declaratively using, for example, attribute grammars, but your usual LL/LR technology is certainly not a good fit, it's like parsing nested parentheses with a regular expression.
Finally, you should use menhir instead of ocamlyacc, because it's just better. You will have more readable and expressive grammars (named parameters, parametrized rules...), better error reporting and grammar debugging features.
Upvotes: 2