Yanick Salzmann

Reputation: 1498

Antlr4 extremely simple grammar failing

Antlr4 has always been a kind of love-hate relationship for me, but I am currently a bit perplexed. I started writing a grammar to the best of my knowledge, then wanted to test it, and it didn't work at all. I then reduced it to a bare-minimum example, and even that does not work. This is my grammar:

grammar SwiftMtComponentFormat;

separator              : ~ZERO EOF;

ZERO                   : '0';

As I understand it, this should match anything except a '0' and then expect the end of the file. I have been testing it with the single-character input '1', which I expected to work. However, this is what happens:

[screenshot of the parse result; image not included]

If I change ~ZERO to ZERO and change my input from 1 to 0, it matches perfectly. For some reason the simple negation does not seem to work, and I fail to understand why.

Upvotes: 2

Views: 191

Answers (1)

sepp2k

Reputation: 370152

In a parser rule, ~ZERO matches any token that is not a ZERO token. The problem in your case is that ZERO is the only type of token you defined at all, so any other input leads to a token recognition error in the lexer and never reaches the parser. So if you enter the input 1, the lexer will discard the 1 with a token recognition error and the parser will only see an empty token stream.

To fix this, you can simply define a lexer rule OTHER that matches any character not matched by previous lexer rules:

OTHER: .;

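Note that this definition has to go after the definition of ZERO. Both rules match a single character, and ANTLR resolves such ties in favor of the rule defined first, so if OTHER came first it would match 0 as well.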

Now the input 1 will produce an OTHER token and ~ZERO will match that token. Of course, you could now replace ~ZERO with OTHER and it wouldn't change anything, but once you add additional tokens, ~ZERO will match those as well whereas OTHER would not.
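Putting the pieces together, the grammar from the question with the catch-all rule added might look like the sketch below. The rule name OTHER is just the one suggested above; any name works as long as the rule comes after ZERO.

```antlr
grammar SwiftMtComponentFormat;

// Parser rule: exactly one token of any type other than ZERO, then end of input.
separator : ~ZERO EOF;

// Lexer rules: when two rules match input of the same length,
// the rule defined first wins, so ZERO must precede OTHER.
ZERO  : '0';
OTHER : . ;   // catch-all: any single character not claimed by an earlier rule
```

With this grammar the input 1 is lexed as a single OTHER token, which ~ZERO accepts. Keep in mind that OTHER also matches whitespace and newlines, so a trailing line break in the test input produces a second token and the separator rule will no longer match.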

Upvotes: 2
