Reputation: 12321
I have a small ANTLR v4 grammar and I am implementing a visitor on it.
Let's say it is a simple calculator and every input must be terminated with a ";"
e.g. x=4+5;
If I leave out the ; at the end, parsing still works, but I get this output on the terminal:
line 1:56 missing ';' at '<EOF>'
It seems the parser still matches the rule and more or less ignores the missing terminal ";".
I would prefer a strict error or an exception instead of this soft warning.
The output is produced by this line:
ParseTree tree = parser.input ();
Is there a way to make the error handling stricter and check for this kind of error?
Upvotes: 2
Views: 593
Reputation: 7409
Yes, you can. Like you, I wanted a 100% perfect parse from user-submitted text and so created a strict error handler that prevents recovery from even simple errors.
The first step is to remove the default error listeners and install your own strict error strategy:
AntlrInputStream inputStream = new AntlrInputStream(stream);
BailLexer lexer = new BailLexer(inputStream); // TALK ABOUT THIS AT BOTTOM
CommonTokenStream tokenStream = new CommonTokenStream(lexer);
LISBASICParser parser = new LISBASICParser(tokenStream);
parser.RemoveErrorListeners(); // UNHOOK ERROR HANDLER
parser.ErrorHandler = new StrictErrorStrategy(); // REPLACE WITH YOUR OWN
LISBASICParser.CalculationContext context = parser.calculation();
CalculationVisitor visitor = new CalculationVisitor();
visitor.VisitCalculation(context);
Here's my StrictErrorStrategy class. It inherits from the DefaultErrorStrategy class and overrides the two 'recovery' methods, which are what let the parser recover from small errors like your missing semicolon:
public class StrictErrorStrategy : DefaultErrorStrategy
{
    public override void Recover(Parser recognizer, RecognitionException e)
    {
        IToken token = recognizer.CurrentToken;
        string message = string.Format("parse error at line {0}, position {1} right before {2} ", token.Line, token.Column, GetTokenErrorDisplay(token));
        throw new Exception(message, e);
    }

    public override IToken RecoverInline(Parser recognizer)
    {
        IToken token = recognizer.CurrentToken;
        string message = string.Format("parse error at line {0}, position {1} right before {2} ", token.Line, token.Column, GetTokenErrorDisplay(token));
        throw new Exception(message, new InputMismatchException(recognizer));
    }

    public override void Sync(Parser recognizer) { }
}
Overriding these two methods allows you to stop (in this case with an exception that is caught elsewhere) on ANY parser error. And making the Sync method empty prevents the normal 're-sync after error' behavior from happening.
The final step is to catch all lexer errors. You do this by creating a new class that inherits from your generated lexer class and overrides the Recover() method like so:
public class BailLexer : LISBASICLexer
{
    public BailLexer(ICharStream input) : base(input) { }

    public override void Recover(LexerNoViableAltException e)
    {
        string message = string.Format("lex error after token {0} at position {1}", _lasttoken.Text, e.StartIndex);
        BasicEnvironment.SyntaxError = message;
        BasicEnvironment.ErrorStartIndex = e.StartIndex;
        throw new ParseCanceledException(BasicEnvironment.SyntaxError);
    }
}
(Edit: In this code, BasicEnvironment is a high-level context object I used in the application to hold settings, errors, results, etc. So if you decide to use this, either do as another reader commented below, or substitute your own context/container.)
With this in place, even small errors during the lexing step will be caught as well. With these two overridden classes in place, the user of my app must supply absolutely perfect syntax to get a successful execution. There you go!
Upvotes: 4
Reputation: 12321
Because I use ANTLR with Java, I am adding the Java version here too. It is the same idea as the accepted answer.
TempParser parser = new TempParser (tokens);
parser.removeErrorListeners ();
parser.addErrorListener (new BaseErrorListener ()
{
    @Override
    public void syntaxError (final Recognizer <?,?> recognizer, Object sym, int line, int pos, String msg, RecognitionException e)
    {
        throw new AssertionError ("ANTLR - syntax-error - line: " + line + ", position: " + pos + ", message: " + msg);
    }
});
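One possible variation on the listener above (only a sketch, not something taken from the accepted answer): throw a dedicated unchecked exception rather than AssertionError, so the calling code can catch syntax errors specifically and read the line and position. The names SyntaxErrorException, StrictParseDemo and runStrict are made up for illustration, and the lambda in main merely stands in for the real parser.input () call so that the snippet compiles without the ANTLR jar.

```java
/** Hypothetical unchecked exception that carries the location of a syntax error. */
class SyntaxErrorException extends RuntimeException {
    private final int line;
    private final int charPositionInLine;

    SyntaxErrorException(int line, int charPositionInLine, String msg) {
        super("syntax error at line " + line + ":" + charPositionInLine + " - " + msg);
        this.line = line;
        this.charPositionInLine = charPositionInLine;
    }

    int getLine() { return line; }

    int getCharPositionInLine() { return charPositionInLine; }
}

class StrictParseDemo {
    /** Runs a parse step; returns "ok" or the message of the first syntax error. */
    static String runStrict(Runnable parseStep) {
        try {
            parseStep.run();
            return "ok";
        } catch (SyntaxErrorException ex) {
            // Only syntax errors are handled here; other failures still propagate.
            return ex.getMessage();
        }
    }

    public static void main(String[] args) {
        // Assumption: this lambda simulates the listener firing during parser.input ().
        String result = runStrict(() -> {
            throw new SyntaxErrorException(1, 56, "missing ';' at '<EOF>'");
        });
        System.out.println(result);
    }
}
```

With something like this in place, the body of syntaxError in the listener would become `throw new SyntaxErrorException (line, pos, msg);`, and the code that calls the parser can wrap the call in a try/catch for that one exception type.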
Upvotes: 2