schneiti

Reputation: 435

ANTLR4 reports "mismatched input": is it misinterpreting my input?

I'm stuck on my university compiler project and have trouble parsing the following input:

haupt() {
    while(i==2) {
        (5+2)*3
        }
}

with this grammar:

grammar Demo;

@header {
    import java.util.List;
    import java.util.ArrayList;
}

program:
    functionList
    ;

functionList:
    function*
    ;

function:
    'haupt()' '{' stmntList '}'                 #haupt
    |'Integer' ID '(' paramList ')' '{' stmntList '}'   #integerFunction
    | 'String' ID '(' paramList ')' '{' stmntList '}'   #stringFunction
    | 'void' ID '(' paramList ')' '{' stmntList '}'     #voidFunction
    ;

paramList:
    param (',' paramList)?
    ;

param:
    'Integer' ID                                        
    | 'String' ID                                       
    ;

variableList:
    ID (',' variableList)?
    ;


stmntList:
    stmnt (stmntList)?                                      
    ;

stmnt:
    'Integer' ID ';'                                                        #integerStmnt
    | 'String' ID   ';'                                                     #stringStmnt
    |  ID '=' expr  ';'                                                     #varAssignment
    | 'print''(' ID ')'     ';'                                             #printText
    | 'toString' '(' ID ')'';'                                              #convertString
    | 'toInteger''('ID')'';'                                                #convertInteger
    | 'if' '(' boolExpr ')' '{' stmntList '}'  ('else' '{' stmntList '}')?  #elseStmnt  
    | 'for' '(' ID '=' expr ',' boolExpr ',' stmnt ')' '{' stmntList '}'    #forLoop    
    | 'while' '(' boolExpr ')' '{' stmntList '}'                            #whileLoop
    | 'do' '{' stmntList '}' 'while' '(' boolExpr ')'   ';'                 #doWhile
    | 'return' expr             ';'                                         #returnVar
    | ID '(' variableList ')'';'                                            #functionCall               
    ;

boolExpr:
    boolParts ('&&' boolExpr)?                  #logicAnd
    | boolParts ('||' boolExpr)?                #logicOr
    ;

boolParts:
    expr '==' expr                      #isEqual
    | expr '!=' expr                    #isUnequal
    | expr '>' expr                     #biggerThan
    | expr '<' expr                     #smallerThan
    | expr '>=' expr                    #biggerEqual
    | expr '<=' expr                    #smallerEqual
    ;

expr:
    links=expr '+' rechts=product                   #addi
    | links = expr '-' rechts=product               #diff
    |product                            #prod
    ;

product:
    links=product '*' rechts=factor                 #mult
    | links=product '/' rechts=factor               #teil
    | factor                            #fact
    ;

factor:
    '(' expr')'                         #bracket
    | ID                                #var
    | zahl=NUMBER                           #numb
    ;


ID  :       [a-zA-Z]*;
NUMBER  :   '0'|[1-9][0-9]*;
WS:         [\r\n\t ]+ -> skip ;

I get the following error message:

line 1:5: mismatched input '(' expecting {<EOF>, '-', '*', '+', '/'}

I think ANTLR misinterprets the input and treats "haupt" as an ID instead of matching the first alternative of the function rule. How can this happen? I always thought ANTLR uses the first matching rule.

Thanks for your help!

Upvotes: 0

Views: 565

Answers (3)

Onur

Reputation: 5211

Regarding your comment:

It works fine for me (note that I changed the ID rule slightly, from [a-zA-Z]* to [a-zA-Z]+, but it also works with your version).

If you look at the token types (the numbers in <>), you'll see that print and abc (an ID) have different types: 17 for 'print' and 33 for ID.

grammar:

grammar Demo;

@header {
    import java.util.List;
    import java.util.ArrayList;
}

program:
    functionList
    ;

functionList:
    function*
    ;

function:
    'haupt()' '{' stmntList '}'                 #haupt
    |'Integer' ID '(' paramList ')' '{' stmntList '}'   #integerFunction
    | 'String' ID '(' paramList ')' '{' stmntList '}'   #stringFunction
    | 'void' ID '(' paramList ')' '{' stmntList '}'     #voidFunction
    ;

paramList:
    param (',' paramList)?
    ;

param:
    'Integer' ID
    | 'String' ID
    ;

variableList:
    ID (',' variableList)?
    ;


stmntList:
    stmnt (stmntList)?
    ;

stmnt:
    'Integer' ID ';'                                                        #integerStmnt
    | 'String' ID   ';'                                                     #stringStmnt
    |  ID '=' expr  ';'                                                     #varAssignment
    | 'print''(' ID ')'     ';'                                             #printText
    | 'toString' '(' ID ')'';'                                              #convertString
    | 'toInteger''('ID')'';'                                                #convertInteger
    | 'if' '(' boolExpr ')' '{' stmntList '}'  ('else' '{' stmntList '}')?  #elseStmnt
    | 'for' '(' ID '=' expr ',' boolExpr ',' stmnt ')' '{' stmntList '}'    #forLoop
    | 'while' '(' boolExpr ')' '{' stmntList '}'                            #whileLoop
    | 'do' '{' stmntList '}' 'while' '(' boolExpr ')'   ';'                 #doWhile
    | 'return' expr             ';'                                         #returnVar
    | ID '(' variableList ')'';'                                            #functionCall
    ;

boolExpr:
    boolParts ('&&' boolExpr)?                  #logicAnd
    | boolParts ('||' boolExpr)?                #logicOr
    ;

boolParts:
    expr '==' expr                      #isEqual
    | expr '!=' expr                    #isUnequal
    | expr '>' expr                     #biggerThan
    | expr '<' expr                     #smallerThan
    | expr '>=' expr                    #biggerEqual
    | expr '<=' expr                    #smallerEqual
    ;

expr:
    links=expr '+' rechts=product                   #addi
    | links = expr '-' rechts=product               #diff
    |product                            #prod
    ;

product:
    links=product '*' rechts=factor                 #mult
    | links=product '/' rechts=factor               #teil
    | factor                            #fact
    ;

factor:
    '(' expr')'                         #bracket
    | ID                                #var
    | zahl=NUMBER                           #numb
    ;


ID  :       [a-zA-Z]+;
NUMBER  :   '0'|[1-9][0-9]*;
WS:         [\r\n\t ]+ -> skip ;

test file:

haupt() {
    while(i==2) {
        print(abc);
        }
}

result:

[@0,0:6='haupt()',<18>,1:0]
[@1,8:8='{',<9>,1:8]
[@2,14:18='while',<6>,2:4]
[@3,19:19='(',<20>,2:9]
[@4,20:20='i',<33>,2:10]
[@5,21:22='==',<25>,2:11]
[@6,23:23='2',<34>,2:13]
[@7,24:24=')',<30>,2:14]
[@8,26:26='{',<9>,2:16]
[@9,36:40='print',<17>,3:8]
[@10,41:41='(',<20>,3:13]
[@11,42:44='abc',<33>,3:14]
[@12,45:45=')',<30>,3:17]
[@13,46:46=';',<7>,3:18]
[@14,56:56='}',<13>,4:8]
[@15,58:58='}',<13>,5:0]
[@16,59:58='<EOF>',<-1>,5:1]
(program (functionList (function haupt() { (stmntList (stmnt while ( (boolExpr (boolParts (expr (product (factor i))) == (expr (product (factor 2))))) ) { (stmntList (stmnt print ( abc ) ;)) })) })))


Upvotes: 0

Onur

Reputation: 5211

I get a different error:

line 3:8 no viable alternative at input '('

which I can explain: (5+2)*3 is not a statement (as Ter already pointed out).

You seem to be using an odd version of ANTLR...

You should also watch out for the warnings:

warning(146): Path\To\File\Demo.g4:95:0: non-fragment lexer rule 'ID' can match the empty string

This tells you that the empty string would also match ID (which is rarely what you want). Changing the * in the ID rule to a + fixes this.
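
A minimal sketch of that change, applied to the lexer rules from the question. Only ID differs; the comments are added here for illustration:

```antlr
ID      : [a-zA-Z]+ ;             // '+' needs at least one letter, so ID can no longer match the empty string
NUMBER  : '0' | [1-9][0-9]* ;     // unchanged from the question
WS      : [\r\n\t ]+ -> skip ;    // unchanged from the question
```

With this version, warning 146 for ID should go away.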

Upvotes: 0

Terence Parr

Reputation: 5962

I would use 'haupt' '(' ')' as suggested, but 'haupt()' should match, and in fact it does. The error I get is on line 3: no alternative of stmnt matches (5+2)*3.
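
To make both points concrete, here is a trimmed-down sketch, not the full grammar from the question. It splits 'haupt()' into three tokens and adds an expression-statement alternative. The grammar name DemoSketch and the label #exprStmnt are hypothetical, and whether a bare expression should count as a statement in this language at all is an assumption. The new alternative also expects a trailing ';', so the input line would have to read (5+2)*3; instead.

```antlr
grammar DemoSketch;   // hypothetical name; a reduced version of Demo

program : function* EOF ;

function
    : 'haupt' '(' ')' '{' stmntList '}'             #haupt      // three separate tokens instead of 'haupt()'
    ;

stmntList : stmnt+ ;

stmnt
    : 'while' '(' boolExpr ')' '{' stmntList '}'    #whileLoop
    | expr ';'                                      #exprStmnt  // hypothetical: lets (5+2)*3; stand as a statement
    ;

boolExpr : expr '==' expr ;                          // reduced to the one comparison the test input uses

expr
    : links=expr '+' rechts=product                 #addi
    | product                                       #prod
    ;

product
    : links=product '*' rechts=factor               #mult
    | factor                                        #fact
    ;

factor
    : '(' expr ')'                                  #bracket
    | ID                                            #var
    | zahl=NUMBER                                   #numb
    ;

ID      : [a-zA-Z]+ ;             // '+' so ID cannot match the empty string
NUMBER  : '0' | [1-9][0-9]* ;
WS      : [\r\n\t ]+ -> skip ;
```

Splitting the literal means whitespace between haupt and the parentheses no longer matters, since WS is skipped between tokens. The input haupt still lexes as the keyword rather than as ID, because implicit literal tokens take priority over later lexer rules when both match the same length.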

Upvotes: 1
