Reputation: 6574
I have a problem with the automatic error recovery of ANTLR v3, which doesn't seem to work in my grammar. Consider the following grammar:
grammar test;
parse : define*;
define : LPAREN 'define' VARIABLE RPAREN;
// Tokens
LPAREN : '(';
RPAREN : ')';
LETTER : ('a'..'z'|'A'..'Z');
VARIABLE : LETTER*;
SPACE : (' ' | '\n' | '\t' | '\r') {$channel = HIDDEN;};
When I call the parse rule with the following input:
(define alpha)
(define beta)
it successfully parses both define rules. However, when I enter a token that doesn't fit:
(define alpha)
)
(define beta)
it stops parsing as soon as it sees the misplaced RPAREN token. I thought that ANTLR could handle misplaced tokens and would try to get back into a rule, but that doesn't seem to work for me. What am I doing wrong?
Thanks in advance.
Upvotes: 2
Views: 377
Reputation: 170227
That is because when you call the parse rule:
parse : define*;
the parser tries to match as many define rules as possible for the input:
(define alpha)
)
(define beta)
After it successfully matches (define alpha), it sees a ), so it can't match another define rule and therefore stops parsing. And because ) is a valid token in your lexer grammar, you see no warning or error.
You need to tell the parser to go through the entire token stream by "anchoring" your main parser rule: place the EOF (end-of-file) token at the end:
parse : define* EOF;
If you now parse the input again, you will see the following error on your console:
line 2:0 missing EOF at ')'
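To illustrate why the unanchored rule stops without complaint, here is a small hand-written Java sketch of the same loop. It is not the code ANTLR generates; the class AnchorSketch and the methods tokenize, consumedByDefineStar and acceptsAnchored are names made up for this example, and error recovery is left out entirely. It only shows that a define* loop exits as soon as the lookahead is not LPAREN, and that the leftover tokens go unnoticed unless the rule also demands EOF.

```java
import java.util.ArrayList;
import java.util.List;

public class AnchorSketch {
    // Hypothetical token kinds mirroring LPAREN, 'define', VARIABLE, RPAREN, plus EOF.
    enum Kind { LPAREN, RPAREN, DEFINE, VARIABLE, EOF }

    // Tiny tokenizer standing in for the lexer; whitespace is skipped,
    // much like the SPACE rule that sends tokens to the hidden channel.
    static List<Kind> tokenize(String src) {
        List<Kind> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (Character.isWhitespace(c)) {
                i++;
            } else if (c == '(') {
                out.add(Kind.LPAREN);
                i++;
            } else if (c == ')') {
                out.add(Kind.RPAREN);
                i++;
            } else if (Character.isLetter(c)) {
                int start = i;
                while (i < src.length() && Character.isLetter(src.charAt(i))) {
                    i++;
                }
                String word = src.substring(start, i);
                out.add(word.equals("define") ? Kind.DEFINE : Kind.VARIABLE);
            } else {
                throw new IllegalArgumentException("unexpected character: " + c);
            }
        }
        out.add(Kind.EOF); // the stream always ends with EOF
        return out;
    }

    // Mimics `parse : define*;` and returns how many tokens were consumed.
    // The loop is entered only while the lookahead is LPAREN, so any other
    // token simply ends the loop without an error.
    static int consumedByDefineStar(String src) {
        List<Kind> tokens = tokenize(src);
        Kind[] shape = { Kind.LPAREN, Kind.DEFINE, Kind.VARIABLE, Kind.RPAREN };
        int pos = 0;
        while (tokens.get(pos) == Kind.LPAREN) {
            for (Kind expected : shape) {
                if (tokens.get(pos) != expected) {
                    throw new IllegalStateException(
                        "expected " + expected + " but found " + tokens.get(pos));
                }
                pos++;
            }
        }
        return pos;
    }

    // Mimics `parse : define* EOF;`: after the loop, the next token must be EOF.
    static boolean acceptsAnchored(String src) {
        List<Kind> tokens = tokenize(src);
        return tokens.get(consumedByDefineStar(src)) == Kind.EOF;
    }

    public static void main(String[] args) {
        String source = "(define alpha)\n)\n(define beta)";
        // The unanchored loop stops after 4 of the 9 non-EOF tokens, silently.
        System.out.println("tokens consumed by define*: " + consumedByDefineStar(source));
        // Requiring EOF exposes the stray ')'.
        System.out.println("accepted with EOF anchor: " + acceptsAnchored(source));
    }
}
```

With the input from the question, the loop consumes only the four tokens of (define alpha), and the anchored variant rejects the input because the next token is ) rather than EOF.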
The fact that define* does not recover is probably because there is no fixed number of tokens to match, which makes the recovery process too hard. The following demo seems to confirm my suspicion:
grammar test;
@parser::members {
  public static void main(String[] args) throws Exception {
    String source =
        "(define alpha) \n" +
        ") \n" +
        "(define beta) ";
    testLexer lexer = new testLexer(new ANTLRStringStream(source));
    testParser parser = new testParser(new CommonTokenStream(lexer));
    parser.parse();
  }
}
parse : define define EOF {System.out.println("parsed >>>" + $text + "<<<");};
define : LPAREN 'define' VARIABLE RPAREN;
LPAREN : '(';
RPAREN : ')';
LETTER : ('a'..'z'|'A'..'Z');
VARIABLE : LETTER+;
SPACE : (' ' | '\n' | '\t' | '\r') {$channel = HIDDEN;};
If you run the testParser class, the following is printed to the console:
line 2:0 extraneous input ')' expecting LPAREN
parsed >>>(define alpha)
)
(define beta) <<<
That is, the warning is printed to System.err, but parsing does continue when the parse rule is limited to two define rules instead of define*.
Upvotes: 2