Andy M
Andy M

Reputation: 119

Remove warnings instead of using backtrack option

I am not sure how to solve this problem without using backtrack=true;.

My sample grammar:

grammar Test;
options {
  language = Java;
  output = AST;
}

parse : expression 
      ;
expression : binaryExpression 
           | tupleExpression 
           ;

binaryExpression : addingExpression (('=='|'!='|'<='|'>='|'>'|'<') addingExpression)*
                 ;

addingExpression : multiplyingExpression (('+'|'-') multiplyingExpression)*
                 ;

multiplyingExpression : unaryExpression 
                        (('*'|'/'|'div'|'inter') unaryExpression)*
                      ;

unaryExpression: ('!'|'-')* primitiveElement;   

primitiveElement : literalExpression
                 | id
                 | sumExpression
                 | '(' expression ')'
                 ;  

sumExpression : 'sum'|'div'|'inter' expression
              ;

tupleExpression : ('<' expression '>' (',' '<' expression '>')*)
                ;

literalExpression : INT
                  ;              

id : IDENTIFIER
   ;

// L E X I C A L   R U L E S      

INT : DIGITS ;   

IDENTIFIER : LETTER (LETTER | DIGIT)*;

WS  :   ( ' '
        | '\t'
        | '\r'
        | '\n'
        ) {$channel=HIDDEN;}
    ;
fragment LETTER : ('a'..'z' | 'A'..'Z' | '_') ;
fragment DIGITS: DIGIT+;
fragment DIGIT : '0'..'9';

Is there a way to fix the grammar such a way that no warnings can happen? Let's assume I want to choose both alternatives depending on the case.

Thank you in advance!

Upvotes: 1

Views: 78

Answers (1)

Bart Kiers
Bart Kiers

Reputation: 170148

Note that:

sumExpression : 'sum'|'div'|'inter' expression
              ;

gets interpreted as:

sumExpression : 'sum'   /* nothing */
              | 'div'   /* nothing */
              | 'inter' expression
              ;

since the | has a low precedence. You probably want:

sumExpression : ('sum'|'div'|'inter') expression
              ;

Let's assume I want to choose both alternatives depending on the case.

That is not possible: you cannot let the parser choose both (or more) alternatives, it can only choose one.

I assume you know why the grammar is ambiguous? If not, here's why: the input A div B can be parsed in two ways:

alternative 1

unaryExpression 'div' unaryExpression
      |                     |
      A                     B

alternative 2

id sumExpression
 |       |   \
 A     'div'  B

It looks like you want 'sum', 'div' and 'inter' to be some sort of unary operator, in which case you could just merge them into your unaryExpression rule:

unaryExpression : '!' unaryExpression
                | '-' unaryExpression
                | 'sum' unaryExpression
                | 'div' unaryExpression
                | 'inter' unaryExpression
                | primitiveElement
                ; 

primitiveElement : literalExpression
                 | id
                 | '(' expression ')'
                 ;  

That way you don't have any ambiguity. Note that A div B will now be parsed as a multiplyingExpression and A div sum B as:

multiplyingExpression
       /    \    
    'div'   unaryExpression
    /            / \ 
   A         'sum'  B

Upvotes: 3

Related Questions