Reputation: 313
In ANTLR4 I have the following grammar:
ID : [_a-zA-Z][0-9_a-zA-Z]*;
INT_LITERAL : [0-9]+;
FLOAT_LITERAL :[0-9]+'.'?[0-9]*([eE][-+]?)?[0-9]+;
When parsing the string 123abc, I'm expecting an error but instead I get the tokens:
123
abc
<EOF>
I've tried to add EOF at the end of my int and float literal regex,
INT_LITERAL : [0-9]+EOF;
FLOAT_LITERAL :[0-9]+'.'?[0-9]*([eE][-+]?)?[0-9]+EOF;
but even then I still get a partial parsing result:
bc
<EOF>
What should I modify in order to make my grammar not accept the string 123abc?
Upvotes: 0
Views: 327
Reputation: 4799
Your lexer produces the correct result.
This type of error should be handled in the parser, not the lexer. Do you have a parser rule that accepts INT_LITERAL followed by ID? I guess you don't. Let the parser do its job: if no such rule exists, the error you're expecting will be reported, but only during the parsing phase, not during lexical analysis.
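To make the lexer/parser split concrete, here is a rough plain-Python stand-in, not ANTLR itself. It copies the three token patterns from the question and approximates the lexer's behaviour: take the longest match at each position, and on a tie prefer the rule defined first. The `tokenize` and `parse_value` helpers are hypothetical names. So is the single parser rule they imitate, roughly `value : (INT_LITERAL | FLOAT_LITERAL | ID) EOF ;`, which is an assumption because the question shows no parser rules.

```python
import re

# Lexer rules copied from the question, in the same order.
TOKEN_SPEC = [
    ("ID", re.compile(r"[_a-zA-Z][0-9_a-zA-Z]*")),
    ("INT_LITERAL", re.compile(r"[0-9]+")),
    ("FLOAT_LITERAL", re.compile(r"[0-9]+\.?[0-9]*([eE][-+]?)?[0-9]+")),
]


def tokenize(text):
    """Split text into (rule, lexeme) pairs, longest match first."""
    tokens = []
    pos = 0
    while pos < len(text):
        best = None
        for name, pattern in TOKEN_SPEC:
            m = pattern.match(text, pos)
            # Strictly longer matches win; ties keep the earlier rule.
            if m and m.end() > pos and (best is None or m.end() > best[1]):
                best = (name, m.end())
        if best is None:
            raise ValueError(f"no lexer rule matches at position {pos}")
        tokens.append((best[0], text[pos:best[1]]))
        pos = best[1]
    tokens.append(("EOF", "<EOF>"))
    return tokens


def parse_value(text):
    """Hypothetical parser rule: exactly one token, then EOF."""
    tokens = tokenize(text)
    first = tokens[0]
    if first[0] == "EOF":
        raise SyntaxError("expected a literal or identifier, got <EOF>")
    following = tokens[1]
    if following[0] != "EOF":
        # This is where 123abc is rejected: the ID token has no place in the rule.
        raise SyntaxError(f"extraneous input {following[1]!r} after {first[1]!r}")
    return first
```

In this sketch, `tokenize("123abc")` yields an INT_LITERAL followed by an ID, just as in the question, and the rejection only happens in `parse_value`, because no rule allows an ID directly after a literal.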
Upvotes: 2