ANTLR Lexer matching the wrong rule

Question

I'm working on a lexer and parser for an old object-oriented chat system (MOO, in case any readers are familiar with its language). In this language, all of the examples below are valid floating-point numbers:

2.3

3.

.2

3e+5

The language also implements an indexing syntax for extracting one or more elements from a string or list (a list being a set of comma-separated expressions enclosed in curly braces). The problem arises from the fact that the language supports a range operator inside the index brackets. For example: a = foo[1..3];

I understand that the ANTLR lexer prefers the longest possible match. Unfortunately, this results in the lexer seeing '1..3' as two floating-point numbers (1. and .3) rather than two integers with a range operator ('..') between them. Is there any way to solve this short of using lexer modes? Given that the values inside an indexing expression can be any valid expression, I would have to duplicate a lot of token rules (essentially all but the floating-point numbers, as I understand it). Granted, I'm new to ANTLR, so I'm sure I'm missing something, and any help is much appreciated. My lexer grammar is below:

lexer grammar MooLexer;

channels { COMMENTS_CHANNEL }

SINGLE_LINE_COMMENT
    : '//' INPUT_CHARACTER* -> channel(COMMENTS_CHANNEL);

DELIMITED_COMMENT
    : '/*' .*? '*/' -> channel(COMMENTS_CHANNEL);

WS
    :   [ \t\r\n] -> channel(HIDDEN)
    ;

IF
    : I F
    ;

ELSE
    : E L S E
    ;

ELSEIF
    : E L S E I F
    ;

ENDIF
    : E N D I F
    ;

FOR
    : F O R;

ENDFOR
    : E N D F O R;

WHILE
    : W H I L E
    ;

ENDWHILE
    : E N D W H I L E
    ;

FORK
    : F O R K
    ;

ENDFORK
    : E N D F O R K
    ;

RETURN
    : R E T U R N
    ;

BREAK
    : B R E A K
    ;

CONTINUE
    : C O N T I N U E
    ;

TRY
    : T R Y
    ;

EXCEPT
    : E X C E P T
    ;

ENDTRY
    : E N D T R Y
    ;

IN
    : I N
    ;

SPLICER
    : '@';

UNDERSCORE
    : '_';

DOLLAR
    : '$';

SEMI
    : ';';

COLON
    : ':';

DOT
    : '.';

COMMA
    : ',';

BANG
    : '!';

OPEN_QUOTE
    : '`';

SINGLE_QUOTE
    : '\'';

LEFT_BRACKET
    : '[';

RIGHT_BRACKET
    : ']';

LEFT_CURLY_BRACE
    : '{';

RIGHT_CURLY_BRACE
    : '}';

LEFT_PARENTHESIS
    : '(';

RIGHT_PARENTHESIS
    : ')';

PLUS
    : '+';

MINUS
    : '-';

STAR
    : '*';

DIV
    : '/';

PERCENT
    : '%';

PIPE
    : '|';

CARET
    : '^';

ASSIGNMENT
    : '=';

QMARK
    : '?';

OP_AND
    : '&&';

OP_OR
    : '||';

OP_EQUALS
    : '==';

OP_NOT_EQUAL
    : '!=';

OP_LESS_THAN
    : '<';

OP_GREATER_THAN
    : '>';

OP_LESS_THAN_OR_EQUAL_TO
    : '<=';

OP_GREATER_THAN_OR_EQUAL_TO
    : '>=';

RANGE
    : '..';

ERROR
    : 'E_NONE'
    | 'E_TYPE'
    | 'E_DIV'
    | 'E_PERM'
    | 'E_PROPNF'
    | 'E_VERBNF'
    | 'E_VARNF'
    | 'E_INVIND'
    | 'E_RECMOVE'
    | 'E_MAXREC'
    | 'E_RANGE'
    | 'E_ARGS'
    | 'E_NACC'
    | 'E_INVARG'
    | 'E_QUOTA'
    | 'E_FLOAT'
    ;

OBJECT
    : '#' DIGIT+
    | '#-' DIGIT+
    ;

STRING 
    : '"' ( ESC | [ !] | [#-[] | [\]-~] | [\t] )* '"';

INTEGER
    : DIGIT+;

FLOAT
    : DIGIT+ [.] (DIGIT*)? (EXPONENTNOTATION EXPONENTSIGN DIGIT+)? 
    | [.] DIGIT+ (EXPONENTNOTATION EXPONENTSIGN DIGIT+)? 
    | DIGIT+ EXPONENTNOTATION EXPONENTSIGN DIGIT+
    ;

IDENTIFIER
    : (LETTER | DIGIT | UNDERSCORE)+
    ;

LETTER
    : LOWERCASE 
    | UPPERCASE
    ;

/* 
 * fragments 
 */

fragment LOWERCASE  
    : [a-z] ;

fragment UPPERCASE  
    : [A-Z] ;

fragment EXPONENTNOTATION
    : ('E' | 'e');

fragment EXPONENTSIGN
    : ('-' | '+');

fragment DIGIT 
    : [0-9] ;

fragment ESC 
    : '\"' | '\\' ;

fragment INPUT_CHARACTER
    : ~[\r\n\u0085\u2028\u2029];

fragment A : [aA];
fragment B : [bB];
fragment C : [cC];
fragment D : [dD];
fragment E : [eE];
fragment F : [fF];
fragment G : [gG];
fragment H : [hH];
fragment I : [iI];
fragment J : [jJ];
fragment K : [kK];
fragment L : [lL];
fragment M : [mM];
fragment N : [nN];
fragment O : [oO];
fragment P : [pP];
fragment Q : [qQ];
fragment R : [rR];
fragment S : [sS];
fragment T : [tT];
fragment U : [uU];
fragment V : [vV];
fragment W : [wW];
fragment X : [xX];
fragment Y : [yY];
fragment Z : [zZ];
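
One way to handle the '1..3' case without lexer modes, offered here as an untested sketch rather than a definitive fix, is to put a semantic predicate on the first FLOAT alternative so that a trailing '.' is accepted only when the next character is not another '.'. The sketch assumes the Java target: the code inside {...}? is target-specific, so other targets would need their own spelling of the lookahead call. It reuses the DIGIT, EXPONENTNOTATION and EXPONENTSIGN fragments from the grammar above.

```antlr
FLOAT
    // Assumption: Java target. "1." only counts as a float when the character
    // after the '.' is not a second '.', so the '.' in "1..3" is left for RANGE.
    : DIGIT+ '.' {_input.LA(1) != '.'}? DIGIT* (EXPONENTNOTATION EXPONENTSIGN DIGIT+)?
    | '.' DIGIT+ (EXPONENTNOTATION EXPONENTSIGN DIGIT+)?
    | DIGIT+ EXPONENTNOTATION EXPONENTSIGN DIGIT+
    ;
```

With this change, the index in a = foo[1..3]; should tokenize as INTEGER, RANGE, INTEGER, while inputs such as 3. or 2.3 should still lex as FLOAT. The trade-off is that the predicate ties the grammar to one target language.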
