no viable alternative at input ANTLR4

Question

I'm designing a language that allows you to make predicates on data. Here is my lexer.

lexer grammar Studylexer;


fragment LETTER : [A-Za-z];
fragment DIGIT : [0-9];
fragment TWODIGIT : DIGIT DIGIT;
fragment MONTH:  ('0' [1-9] | '1' [0-2]);
fragment DAY: ('0' [1-9] | '1' [1-9] | '2' [1-9] | '3' [0-1]);



TIMESTAMP: TWODIGIT ':' TWODIGIT; // représentation de la timestamp 

DATE : TWODIGIT TWODIGIT MONTH DAY; // représentation de la date


ID : LETTER+; // match identifiers 

STRING : '"' ( ~ '"' )* '"' ; // match string content

NEWLINE:'
'? '
' ; // return newlines to parser (is end-statement signal)

WS  :   [ 	]+ -> skip ; // toss out whitespace 

LIST: ( LISTSTRING | LISTDATE | LISTTIMESTAMP ) ; // list of variabels;

// list of operators

GT: '>';
LT: '<';
GTEQ: '>=';
LTEQ:'<=';
EQ: '=';
IN: 'in';

fragment LISTSTRING: STRING ',' STRING (',' STRING)*; // list of strings
fragment LISTDATE : DATE   ',' DATE   (',' DATE)*;    // list of dates
fragment LISTTIMESTAMP:TIMESTAMP ',' TIMESTAMP (',' TIMESTAMP )*; // list of timestamps
    
NAMES: 'filename' | 'timestamp' | 'tso' | 'region' | 'processType' | 'businessDate' | 'lastModificationDate'; // name of variables in the where block

KEY: ID '[' NAMES ']' | ID '.' NAMES; // predicat key

and here is a part of my grammar.

expr: KEY op = ('>' | '<') value = (  DATE  | TIMESTAMP )  NEWLINE          # exprGTORLT
    | KEY op = ('>='| '<=') value = ( DATE  | TIMESTAMP )  NEWLINE          #  exprGTEQORLTEQ
    | KEY  '=' value = ( STRING | DATE      | TIMESTAMP )  NEWLINE          # exprEQ
    | KEY 'in'   LIST                                      NEWLINE          #exprIn

When I make a predicate for example.

tab [key]  in  "value1", "value2"

ANTLR generates an error.

no viable alternative at input tab [key] in

What can I do to resolve this problem?

sepp2k · Accepted Answer

First tab [key] does not produce a KEY token like you want it to for two reasons:

It contains spaces and KEY doesn't allow any spaces. The best way to fix that would be to remove the KEY rule from your lexer and instead turn it into a parser rule (meaning you also need to turn [ and ] into their own tokens). Then the white space in your input would be between tokens and thus successfully skipped.
key is not actually one of the words listed in NAMES.

Then another issue is that in is recognized as an ID token, not an IN token. That's because both ID and IN would produce a match of the same length and in cases like that the rule that's listed first takes precedence. So you should define ID after all of the keywords because otherwise the keywords will never be matched.

no viable alternative at input ANTLR4

Answers (1)

Related Questions