Deomachus

Reputation: 179

ANTLR 4.5.2 not recognizing Number token

I'm trying to parse the string "define one: 1." with the following simple demo class:

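import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;

public class ANTLRDemo {
    public static void main(String[] args)  {
        // Lex and parse the sample input with the generated Aremelle classes.
        AremelleLexer lexer = new AremelleLexer(new ANTLRInputStream("define one: 1."));
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        AremelleParser parser = new AremelleParser(tokens);
        AremelleParser.ProgramContext p = parser.program();
    }
}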

However, I keep running into this error message:

line 1:12 mismatched input '1' expecting {'define', '{', Identifier, Number, String}

The relevant grammar is:

DIGIT
:   '0'..'9'
;

Integer
:   DIGIT+
;

Number
:   Integer (DOT Integer)?
;

Why is "1" not being recognized as a Number?

An interesting note is that the string "define one: 1.0." parses fine, so ANTLR is able to recognize numbers with decimal points, but not integers without decimal points.

Can anybody spot what I'm doing wrong?

Upvotes: 1

Views: 1604

Answers (1)

Lucas Trzesniewski

Reputation: 51390

Your lexer rules are ambiguous.

1 is a token which can be matched by all of your rules: DIGIT, Integer and Number. Note that all 3 of them are lexer rules since their name starts with an uppercase letter.

To disambiguate, ANTLR first chooses the rule that matches the longest stretch of input, and when several rules match the same length, it chooses the one defined first in the grammar.

So in your case, 1 yields a DIGIT token, but your grammar expects a Number, as the error message says.
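
To make the tie-breaking concrete, here is a small stand-alone sketch that imitates it with plain Java regular expressions. It is a simplified model, not ANTLR's actual implementation: the class name LexerTieBreakDemo, the method winningRule, and the regex translations of your three rules are assumptions made for illustration, and DOT is assumed to match a literal '.'.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: models "longest match wins, then first rule wins".
class LexerTieBreakDemo {
    // Rules kept in definition order, mirroring the grammar in the question.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("DIGIT", Pattern.compile("[0-9]"));
        RULES.put("Integer", Pattern.compile("[0-9]+"));
        RULES.put("Number", Pattern.compile("[0-9]+(\\.[0-9]+)?"));
    }

    // Returns the name of the rule that would produce the next token,
    // or null if no rule matches at the start of the input.
    static String winningRule(String input) {
        String best = null;
        int bestLen = 0;
        for (Map.Entry<String, Pattern> e : RULES.entrySet()) {
            Matcher m = e.getValue().matcher(input);
            // Strictly greater: on a tie, the earlier rule keeps the win.
            if (m.lookingAt() && m.end() > bestLen) {
                best = e.getKey();
                bestLen = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(winningRule("1."));   // DIGIT
        System.out.println(winningRule("12."));  // Integer
        System.out.println(winningRule("1.0.")); // Number
    }
}
```

Under this model "1." and "12." never produce a Number, while "1.0." does, because only Number can consume the decimal part and so wins on length. That matches what you observed.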

I think what you intended to do is to use fragments, which aren't standalone lexer rules but reusable grammar parts:

fragment DIGIT
:   '0'..'9'
;

fragment INTEGER
:   DIGIT+
;

NUMBER
:   INTEGER (DOT INTEGER)?
;

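With this grammar, every number is lexed as a single NUMBER token, because fragment rules never produce tokens on their own. Note that the rules are renamed here, so any parser rule that referenced Number needs to reference NUMBER instead.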

Upvotes: 2
