Reputation: 333
I am pretty sure I have conflicting YACC rules (specifically the exp exp
and group_open exp group_close
rules). I am trying to build a simple boolean query syntax that lets people do stuff like a "b c" -(d or e) which would rougly be equivalent to a AND "b c" AND NOT (d OR e).
However I am having trouble implemnting both the group rule ()
and the AND rule (basically just spaces).
%lex
%%
\s+ ;
or|OR return 'or';
and|AND return 'and';
\"[^\"]+\" return 'phrase';
"-"\b return 'not';
"(" return 'group_open';
")" return 'group_close';
[^\s,]+ return 'word';
/lex
%token space
%token phrase
%token group_open
%token group_close
%token word
%left or
%left and
%left not
%%
query : exp { return $1; }
;
exp : term
| exp or exp { $$ = $1+" OR "+$3; }
| exp and exp { $$ = $1+" AND "+$3; }
/* this is the one that is casuing me issues */
| exp exp { $$ = $1+" AND "+$3; }
| not exp { $$ = "NOT "+$2; }
| group_open exp group_close { $$ = "("+$2+")"; }
;
term : phrase { $$ = "PHRASE"; }
| word { $$ = "WORD"; }
;
Any help would be great.
I am testing my grammar by using jison.org
Below are the errors I am getting
Conflicts encountered:
Resolve S/R conflict (shift by default.)
(1,8, 2,5) -> 1,8Resolve S/R conflict (shift by default.)
(1,9, 2,5) -> 1,9Resolve S/R conflict (shift by default.)
(1,6, 2,5) -> 1,6Resolve S/R conflict (shift by default.)
(1,7, 2,5) -> 1,7Resolve S/R conflict (shift by default.)
(1,4, 2,5) -> 1,4Resolve S/R conflict (shift by default.)
(1,5, 2,5) -> 1,5Resolve S/R conflict (shift by default.)
(1,6, 2,6) -> 1,6Resolve S/R conflict (shift by default.)
(1,7, 2,6) -> 1,7Resolve S/R conflict (shift by default.)
(1,5, 2,6) -> 1,5Resolve S/R conflict (shift by default.)
(1,6, 2,3) -> 1,6Resolve S/R conflict (shift by default.)
(1,7, 2,3) -> 1,7Resolve S/R conflict (shift by default.)
(1,5, 2,3) -> 1,5Resolve S/R conflict (shift by default.)
(1,6, 2,4) -> 1,6Resolve S/R conflict (shift by default.)
(1,7, 2,4) -> 1,7Resolve S/R conflict (shift by default.)
(1,5, 2,4) -> 1,5
Upvotes: 0
Views: 197
Reputation: 126175
Using precedence rules to resolve this conflict really requires understanding the details of how LR parsing works, and how yacc precendece levels are used to resolve shift/reduce conflicts.
Ambiguities in an expression grammar like yours manifest as shift/reduce conflicts where the parser does not know whether to reduce a rule for an operator that it has parsed or shift a token that might lead to some higher precedence operation. If the rule that the shift leads to is higher precedence, then it should be shifted, but sometimes it is tough to know what rule the token will lead to.
In your example, having recognized the RHS of some rule that ends with exp
, and looking at a lookahead token that can be the start of another exp
, it needs to reduce if the rule seen is higher precedence than an exp exp
expression, and shift otherwise. So you need to set the precedence of every token that might start an expression as just lower than the precedence of the exp exp
rule (assuming you want left associativity), and higher than other lower precedence things:
%left or
%nonassoc phrase word group_open not
%left and
%left UNARY
%%
query : exp { return $1; }
;
exp : term
| exp or exp { $$ = $1+" OR "+$3; }
| exp and exp { $$ = $1+" AND "+$3; }
| exp exp %prec and { $$ = $1+" AND "+$2; }
| not exp %prec UNARY { $$ = "NOT "+$2; }
| group_open exp group_close { $$ = "("+$2+")"; }
;
term : phrase { $$ = "PHRASE"; }
| word { $$ = "WORD"; }
;
Note that not
may start a expression, so needs to have lower precedence than exp exp
, so we introduce a new fake token UNARY
that will never be returned by the lexer; it exists soly to give higher precedence to the not exp
rule with the %prec UNARY
directive. Also, the exp exp
rule needs an explicit %prec
directive to give it a precedence level (by default rules get the precedence of the first token on the RHS, but exp exp
has no tokens on the RHS).
The above rules make the precedence of exp exp
and exp and exp
the same and left associative. That means 'a b and c' will be parsed as '(a b) and c', and 'a and b c' will be parsed as '(a and b) c'. If you instead want exp exp
to be higher precedence than exp and exp
, you need to create another fake token with higher precedence than and
and use that for the precedence of exp exp
, moving the %nonassoc
up to be just below that as well.
Alternately, you can avoid yacc precedence rules altogether, and instead rewrite your grammar with multiple exp
rules, one for each precedence level:
query : exp1 { return $1; } ;
exp1 : exp1 or exp2 { $$ = $1+" OR "+$3; }
| exp2 ;
exp2 : exp2 and exp3 { $$ = $1+" AND "+$3; }
| exp2 exp3 { $$ = $1+" AND "+$2; }
| exp3 ;
exp3 : not exp3 { $$ = "NOT "+$2; }
| '(' exp1 ')' { $$ = "("+$2+")"; }
| phrase { $$ = "PHRASE"; }
| word { $$ = "WORD"; } ;
Upvotes: 1