ANTLR: Parsing 2-digit numbers when other numeric literals are also possible

Question

I'm writing a grammar for a moderately sized language, and I'm trying to implement time literals of the form hh:mm:ss.

However, whenever I try to parse, for example, 12:34:56 as a timeLiteral, I get mismatched token exceptions on the digits. Does anyone know what I might be doing wrong?

Here are the relevant rules as currently defined:

timeLiteral
    :   timePair COLON timePair COLON timePair -> ^(TIMELIT timePair*)
    ;

timePair
    :   DecimalDigit DecimalDigit
    ;

NumericLiteral
    : DecimalLiteral
    ;

fragment DecimalLiteral
    : DecimalDigit+ ('.' DecimalDigit+)?
    ;

fragment DecimalDigit
    : ('0'..'9')
    ;

Scott Stanchfield · Accepted Answer

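The problem is that the lexer consumes the digits itself and returns a NumericLiteral token for each run of them.

The parser will never see a DecimalDigit, because DecimalDigit is a fragment rule: fragments only help build other lexer rules and never become tokens on their own. Assuming COLON matches ':', the input 12:34:56 reaches the parser as NumericLiteral COLON NumericLiteral COLON NumericLiteral, so timePair has nothing to match.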

I would recommend moving timeLiteral into the lexer (capitalizing its name makes it a lexer rule), so you'd have something like:

timeLiteral
    :   TimeLiteral -> ^(TIMELIT TimeLiteral*)
    ;

number
    :   DecimalLiteral
    ;

TimeLiteral
    :   DecimalDigit DecimalDigit COLON
        DecimalDigit DecimalDigit COLON
        DecimalDigit DecimalDigit
    ;

DecimalLiteral
    :   DecimalDigit+ ('.' DecimalDigit+)?
    ;

fragment DecimalDigit
    :   ('0'..'9')
    ;

Keep in mind that the lexer and parser are completely independent. The lexer determines which tokens will be passed to the parser, then the parser gets to group them.
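
For reference, the rules above can be assembled into a small self-contained grammar. This is only a sketch under assumptions that are not in the original post: it targets ANTLR 3 with AST output, the grammar name TimeDemo and the literal rule are hypothetical, COLON is assumed to match a single ':', and TIMELIT is assumed to be an imaginary token declared in the tokens block.

```antlr
grammar TimeDemo;          // hypothetical grammar name

options { output = AST; }

tokens { TIMELIT; }        // assumed imaginary token, used as the AST root

// Parser rules: these only ever see whole tokens emitted by the lexer.
literal                    // hypothetical entry rule for trying the grammar
    :   timeLiteral
    |   number
    ;

timeLiteral
    :   TimeLiteral -> ^(TIMELIT TimeLiteral)
    ;

number
    :   DecimalLiteral
    ;

// Lexer rules: hh:mm:ss is recognized as one token before parsing starts.
TimeLiteral
    :   DecimalDigit DecimalDigit COLON
        DecimalDigit DecimalDigit COLON
        DecimalDigit DecimalDigit
    ;

DecimalLiteral
    :   DecimalDigit+ ('.' DecimalDigit+)?
    ;

COLON                      // assumed definition; the original post does not show it
    :   ':'
    ;

// Fragment: usable inside other lexer rules, never emitted as a token.
fragment DecimalDigit
    :   '0'..'9'
    ;
```

With a grammar along these lines, 12:34:56 should arrive at the parser as a single TimeLiteral token, while a plain 12 or 12.5 should arrive as a DecimalLiteral. Range checks such as hours below 24 are not handled here and would need a separate validation step.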
