Forrest

Reputation: 721

ANTLR4 error: missing NEWLINE

grammar Hello;          
prog:   stat+ EOF;
stat:   expr NEWLINE    # printExpr 
|   ID '=' expr NEWLINE # assign
|   NEWLINE     # blank 
|   STRING NEWLINE  # string
;
expr:   expr (MUL|DIV) expr # opExpr
|   expr (ADD|SUB) expr # opExpr
|   expr AND expr # andExpr
|   INT         # int
|   ID          # id
|   '(' expr ')'        # parens
;
MUL:    '*';
DIV:    '/';
ADD:    '+';
SUB:    '-';
ID: [a-zA-Z]+[0-9a-zA-Z]*;
NEWLINE : [\r\n] ;
INT     : [0-9]+ ;
AND :     '&';
WS  : [ \t\r\n]+ -> skip;
CM  : '//' ~[\r\n]* -> skip;

Can someone explain what is wrong with my grammar? This is the error I get:

[image: error]

Your help will be appreciated!

Upvotes: 2

Views: 2647

Answers (1)

Mephy

Reputation: 2986

The problem is in these lexer rules:

NEWLINE : [\r\n] ;
WS : [ \t\r\n]+ -> skip;

When the lexer finds a \r\n in the input, both rules can match at that position. WS matches the entire \r\n as a single token, whereas NEWLINE matches only one character at a time, so it would need two tokens (one for \r, one for \n).

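In this scenario, ANTLR always chooses the longest match, so in your case it produces WS. (When two rules match the same number of characters, the rule listed first in the grammar wins.) For the input a = 3\r\nx = 4\r\n, the lexer matches the following tokens (WS is shown for clarity, even though it is skipped and never reaches the parser):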

ID  WS  '='  WS  INT  WS    ID  WS  '='  WS  INT  WS
a        =       3    \r\n  x        =       4    \r\n

But what you're looking for is:

ID  WS  '='  WS  INT  NEWLINE  ID  WS  '='  WS  INT  NEWLINE
a        =       3    \r\n     x        =       4    \r\n

Your grammar is written on the assumption that every line break produces a NEWLINE token, so I suggest changing the WS rule to:

WS: [ \t]+ -> skip;
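
If you want to see the longest-match behaviour without running ANTLR, here is a minimal Python sketch that imitates it with regular expressions. It is only an illustration, not ANTLR's actual implementation: the names tokenize, visible, ORIGINAL and FIXED are made up for this example, EQ is a stand-in name for the '=' literal, and only a subset of the lexer rules is included.

```python
import re

def tokenize(text, rules):
    """Toy maximal-munch lexer: the longest match wins, ties go to the earliest rule."""
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strict '>' means an equally long later rule never replaces an earlier one.
            if m and len(m.group()) > best_len:
                best_name, best_len = name, len(m.group())
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append(best_name)
        pos += best_len
    return tokens

def visible(tokens):
    """Drop WS tokens, imitating the effect of '-> skip'."""
    return [t for t in tokens if t != "WS"]

# Subset of the question's lexer rules, in the same relative order.
# EQ is a hypothetical name for the '=' literal used in the parser rules.
ORIGINAL = [
    ("EQ", r"="),
    ("ID", r"[a-zA-Z]+[0-9a-zA-Z]*"),
    ("NEWLINE", r"[\r\n]"),
    ("INT", r"[0-9]+"),
    ("WS", r"[ \t\r\n]+"),
]

# Same rules with the suggested WS rule, which no longer overlaps NEWLINE.
FIXED = ORIGINAL[:-1] + [("WS", r"[ \t]+")]

source = "a = 3\r\nx = 4\r\n"
print(visible(tokenize(source, ORIGINAL)))
# ['ID', 'EQ', 'INT', 'ID', 'EQ', 'INT']
print(visible(tokenize(source, FIXED)))
# ['ID', 'EQ', 'INT', 'NEWLINE', 'NEWLINE', 'ID', 'EQ', 'INT', 'NEWLINE', 'NEWLINE']
```

With the original rules, no NEWLINE ever reaches the parser for this input, which is why it reports a missing NEWLINE. With the changed WS rule, each \r\n yields two NEWLINE tokens because NEWLINE matches a single character; in your grammar the second one should be consumed by the blank alternative of stat.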

Upvotes: 2
