Reputation: 255
I am new to Antlr4 and have been wracking my brain for some days now about a behaviour that I simply don't understand. I have the following combined grammar and expect it to fail and report an error, but it doesn't:
grammar MWE;
parse: cell EOF;
cell: WORD;
WORD: ('a'..'z')+;
If I feed it the input
a4
I expect it to not be able to parse it, because I want it to match the whole input string and not just a part of it, as signified by the EOF
. But instead it reports no error (I listen for errors with a errorlistener implementing the IAntlrErrorListener
interface) and gives me the following parse tree:
(parse (cell a) <EOF>)
Why is this?
Upvotes: 5
Views: 663
Reputation: 100059
The error recovery mechanism when input is reached which no lexer rule matches is to drop a character and continue with the next one. In your case, the lexer is dropping the 4
character, so your parser is seeing the equivalent of this input:
a
The solution is to instruct the lexer to create a token for the dropped character rather than ignore it, and pass that token on to the parser where an error will be reported. In the grammar, this rule takes the following form and is always added as the last rule in the grammar. If you have multiple lexer modes, a rule with this form should appear as the last rule in the default mode as well as the last rule in each extra mode.
ErrChar
: .
;
Upvotes: 5