Reputation: 119
I am not sure how to solve this problem without using backtrack=true;.
My sample grammar:
grammar Test;
options {
language = Java;
output = AST;
}
parse : expression
;
expression : binaryExpression
| tupleExpression
;
binaryExpression : addingExpression (('=='|'!='|'<='|'>='|'>'|'<') addingExpression)*
;
addingExpression : multiplyingExpression (('+'|'-') multiplyingExpression)*
;
multiplyingExpression : unaryExpression
(('*'|'/'|'div'|'inter') unaryExpression)*
;
unaryExpression: ('!'|'-')* primitiveElement;
primitiveElement : literalExpression
| id
| sumExpression
| '(' expression ')'
;
sumExpression : 'sum'|'div'|'inter' expression
;
tupleExpression : ('<' expression '>' (',' '<' expression '>')*)
;
literalExpression : INT
;
id : IDENTIFIER
;
// L E X I C A L R U L E S
INT : DIGITS ;
IDENTIFIER : LETTER (LETTER | DIGIT)*;
WS : ( ' '
| '\t'
| '\r'
| '\n'
) {$channel=HIDDEN;}
;
fragment LETTER : ('a'..'z' | 'A'..'Z' | '_') ;
fragment DIGITS: DIGIT+;
fragment DIGIT : '0'..'9';
Is there a way to fix the grammar such a way that no warnings can happen? Let's assume I want to choose both alternatives depending on the case.
Thank you in advance!
Upvotes: 1
Views: 78
Reputation: 170148
Note that:
sumExpression : 'sum'|'div'|'inter' expression
;
gets interpreted as:
sumExpression : 'sum' /* nothing */
| 'div' /* nothing */
| 'inter' expression
;
since the |
has a low precedence. You probably want:
sumExpression : ('sum'|'div'|'inter') expression
;
Let's assume I want to choose both alternatives depending on the case.
That is not possible: you cannot let the parser choose both (or more) alternatives, it can only choose one.
I assume you know why the grammar is ambiguous? If not, here's why: the input A div B
can be parsed in two ways:
unaryExpression 'div' unaryExpression
| |
A B
id sumExpression
| | \
A 'div' B
It looks like you want 'sum'
, 'div'
and 'inter'
to be some sort of unary operator, in which case you could just merge them into your unaryExpression
rule:
unaryExpression : '!' unaryExpression
| '-' unaryExpression
| 'sum' unaryExpression
| 'div' unaryExpression
| 'inter' unaryExpression
| primitiveElement
;
primitiveElement : literalExpression
| id
| '(' expression ')'
;
That way you don't have any ambiguity. Note that A div B
will now be parsed as a multiplyingExpression
and A div sum B
as:
multiplyingExpression
/ \
'div' unaryExpression
/ / \
A 'sum' B
Upvotes: 3