Reputation: 721
grammar Hello;
prog: stat+ EOF;
stat: expr NEWLINE # printExpr
| ID '=' expr NEWLINE # assign
| NEWLINE # blank
| STRING NEWLINE # string
;
expr: expr (MUL|DIV) expr # opExpr
| expr (ADD|SUB) expr # opExpr
| expr AND expr # andExpr
| INT # int
| ID # id
| '(' expr ')' # parens
;
MUL: '*';
DIV: '/';
ADD: '+';
SUB: '-';
ID: [a-zA-Z]+[0-9a-zA-Z]*;
NEWLINE : [\r\n] ;
INT : [0-9]+ ;
AND : '&';
WS : [ \t\r\n]+ -> skip;
CM : '//' ~[\r\n]* -> skip;
Can someone explain what is wrong with my code? This is my error:
Your help will be appreciated!
Upvotes: 2
Views: 2647
Reputation: 2986
The problem is in these lexer rules:
NEWLINE : [\r\n] ;
WS : [ \t\r\n]+ -> skip;
When the lexer finds a \r\n in the input string, it will try to match the rules, and both will match. However, WS will match the entire \r\n, producing one WS token, and NEWLINE will match \r then \n, producing two NEWLINE tokens.
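In this scenario, ANTLR always chooses the longest match, so in your case it produces WS. If you look at your lexer output for a = 3\r\nx = 4\r\n, the matched tokens will be as follows (WS is shown for illustration; because of -> skip it is discarded and never reaches the parser):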
ID   WS   '='  WS   INT  WS   ID   WS   '='  WS   INT  WS
a         =         3    \r\n x         =         4    \r\n
But what you're looking for is:
ID   WS   '='  WS   INT  NEWLINE ID   WS   '='  WS   INT  NEWLINE
a         =         3    \r\n    x         =         4    \r\n
Your grammar seems to be written expecting every line break to generate a NEWLINE token, so I suggest changing the WS rule to:
WS: [ \t]+ -> skip;
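To see the longest-match behaviour in isolation, here is a rough Python sketch of a maximal-munch lexer. It is only a toy simulation, not how ANTLR is actually implemented: the tokenize function, the rule tables, and the use of Python regular expressions are all assumptions made for illustration, the CM rule is left out, and '=' is added as an explicit rule to stand in for the implicit literal token. Ties are resolved in favour of the rule listed first, which is also how ANTLR resolves them.

```python
import re

def tokenize(text, rules):
    """Toy maximal-munch lexer (illustrative only, not ANTLR's algorithm).

    At each position the rule with the longest match wins; on a tie,
    the rule that appears first in `rules` wins.
    """
    tokens = []
    pos = 0
    while pos < len(text):
        best_name, best_len = None, 0
        for name, pattern in rules:
            m = re.compile(pattern).match(text, pos)
            # Strictly greater: an earlier rule keeps the win on a tie.
            if m and m.end() - pos > best_len:
                best_name, best_len = name, m.end() - pos
        if best_name is None:
            raise ValueError(f"no rule matches at position {pos}")
        tokens.append(best_name)
        pos += best_len
    return tokens

# Rule order mirrors the grammar in the question (hypothetical translation).
COMMON = [
    ("'='", r"="),
    ("ID", r"[a-zA-Z]+[0-9a-zA-Z]*"),
    ("NEWLINE", r"[\r\n]"),
    ("INT", r"[0-9]+"),
]
ORIGINAL = COMMON + [("WS", r"[ \t\r\n]+")]  # WS can swallow line breaks
FIXED = COMMON + [("WS", r"[ \t]+")]         # WS limited to spaces and tabs

source = "a = 3\r\nx = 4\r\n"
original_tokens = tokenize(source, ORIGINAL)
fixed_tokens = tokenize(source, FIXED)
print(original_tokens)
print(fixed_tokens)
```

With the original rules the sketch never emits NEWLINE for this input, because WS matches both characters of \r\n while NEWLINE matches only one. With the narrower WS rule, each \r\n comes out as two NEWLINE tokens, one for \r and one for \n; in the grammar above the extra one is absorbed by the blank alternative of stat. If a single token per line break is preferred, a rule such as NEWLINE : '\r'? '\n' ; would be one way to get it.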
Upvotes: 2