Reputation: 407
I am trying to use ANTLR to parse a log file. Because I am only interested in part of the log, I want to write a partial parser that processes only the important parts.
For example, I want to parse this segment:
[ 123 begin ]
So I wrote the grammar:
log :
'[' INT 'begin' ']'
;
INT : '0'..'9'+
;
NEWLINE
: '\r'? '\n'
;
WS
: (' '|'\t')+ {skip();}
;
But the segment may appear in the middle of a line, for example:
111 [ 123 begin ] 222
From the discussion "What is the wrong with the simple ANTLR grammar?", I understand why my grammar can't process the line above.
What I want to know is: is there any way to make ANTLR ignore errors and continue processing the remaining text?
Thanks for any advice! Leon
Upvotes: 7
Views: 1313
Reputation: 170158
Since '[' might also be skipped in certain cases outside of [ 123 begin ], there's no way to handle this in the lexer. You'll have to create a parser rule that matches token(s) to be skipped (see the noise rule).
You'll also need to create a fall-through rule that matches any character if none of the other lexer rules matches (see the ANY rule).
A quick demo:
grammar T;
parse
: ( log {System.out.println("log=" + $log.text);}
| noise
)*
EOF
;
log : OBRACK INT BEGIN CBRACK
;
noise
: ~OBRACK // any token except '['
| OBRACK ~INT // a '[' followed by any token except an INT
| OBRACK INT ~BEGIN // a '[', an INT and any token except a BEGIN
| OBRACK INT BEGIN ~CBRACK // a '[', an INT, a BEGIN and any token except ']'
;
BEGIN : 'begin';
OBRACK : '[';
CBRACK : ']';
INT : '0'..'9'+;
NEWLINE : '\r'? '\n';
WS : (' '|'\t')+ {skip();};
ANY : .;
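To make the idea behind the noise and ANY rules easier to see, here is a small hand-written Java sketch of the same approach: tokenize everything (with a catch-all for unknown characters), then walk the tokens and keep only the '[' INT 'begin' ']' sequences. This is not ANTLR-generated code; the class LogScanner and its methods are hypothetical names made up for illustration. It also skips noise one token at a time, which is a simplification, so it may not behave identically to the noise rule above in edge cases.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not produced by ANTLR.
final class LogScanner {
    // Token kinds mirroring the lexer rules; ANY is the fall-through kind.
    enum Kind { BEGIN, OBRACK, CBRACK, INT, NEWLINE, ANY }

    static final class Token {
        final Kind kind;
        final String text;
        Token(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    // Group 1 is whitespace (dropped, like WS); groups 2..7 follow the order of Kind.
    // The final "(.)" alternative plays the role of the ANY rule.
    private static final Pattern LEXER = Pattern.compile(
        "([ \\t]+)|(begin)|(\\[)|(\\])|([0-9]+)|(\\r?\\n)|(.)", Pattern.DOTALL);

    static List<Token> tokenize(String input) {
        List<Token> tokens = new ArrayList<>();
        Kind[] kinds = Kind.values();
        Matcher m = LEXER.matcher(input);
        while (m.find()) {
            if (m.group(1) != null) continue; // skip whitespace
            for (int g = 2; g <= 7; g++) {
                if (m.group(g) != null) {
                    tokens.add(new Token(kinds[g - 2], m.group(g)));
                    break;
                }
            }
        }
        return tokens;
    }

    // True if the tokens starting at index 'at' have exactly the given kinds.
    private static boolean matches(List<Token> t, int at, Kind... kinds) {
        if (at + kinds.length > t.size()) return false;
        for (int k = 0; k < kinds.length; k++) {
            if (t.get(at + k).kind != kinds[k]) return false;
        }
        return true;
    }

    // Returns the INT text of every '[' INT 'begin' ']' sequence found in the input.
    static List<String> findLogs(String input) {
        List<Token> t = tokenize(input);
        List<String> logs = new ArrayList<>();
        int i = 0;
        while (i < t.size()) {
            if (matches(t, i, Kind.OBRACK, Kind.INT, Kind.BEGIN, Kind.CBRACK)) {
                logs.add(t.get(i + 1).text);
                i += 4; // consume the whole log entry
            } else {
                i++;    // noise: skip one token and try again
            }
        }
        return logs;
    }

    public static void main(String[] args) {
        System.out.println(findLogs("111 [ 123 begin ] 222")); // prints [123]
    }
}
```

The design point is the same as in the grammar: the catch-all alternative guarantees the tokenizer never fails on unexpected characters, and the decision about what counts as noise is made on the token sequence rather than on single characters.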
Upvotes: 7