Reputation: 4094
I inherited an ANTLR grammar and now implement it without any change with Python Lex Yacc. My Problem is, that ANTLR normally uses very high level EBNF to define a grammar, whereas Yacc uses simple context free grammar.
I struggle with the AST definition of binary expressions. Given from ANTLR, they look like
disjunction
: conjunction ('||' conjunction)*
This definition exists for precedence reasons, do not wonder about the names.
In my CFG, it looks like
def p_sum(self, p):
""" sum : product product_list """
???
def p_product_list(self, p):
""" product_list : PLUS product product_list
| MINUS product product_list
| epsilon
"""
???
Some other expressions look like
def p_disjunction_1(self, p):
""" disjunction : conjunction """
p[0] = p[1]
def p_disjunction_2(self, p):
""" disjunction : conjunction OR disjunction """
p[0] = BinaryOp(p[2], p[1], p[3], p[1].coord)
What I want is that my AST consists only of BinaryExpression Nodes, as given in the second example:
BinaryExpression('+', lhs, rhs)
But this grammar gives me a list of factors. Is there any way I can write it in my way? The ??? are the places where I need to but an expression to build my AST, but I cannot think of a nice way to do it. What I see is to define a node with a list of unary expressions, but that is ugly for me. Or is there another way to write the grammar? I do not dare to touch it, since I fear I will alter it in a way which parses different stuff.
Upvotes: 0
Views: 993
Reputation: 241861
The "original" form of the disjunction
production (before it had left-recursion eliminated to make it appropriate for LL parsing) was
disjunction: conjunction
| disjunction '|' conjunction
;
That will left-associate disjunction; if you want right-association -- for example for assignment operators -- use right-recursion:
assignment: expression
| expression '=' assignment
;
(That might not be precisely what you want on the left-hand side of an assignment operator, but it shows a general pattern for right-associative rules.)
Similarly:
sum: product
| sum '+' product
| sum '-' product
;
The reconstruction of the grammar corresponding to a sensible parse tree should be fairly mechanical, but you probably should write enough test cases, parsing them with both grammars, to assure yourself that you got it correct.
Upvotes: 1