blgrnboy

Reputation: 5167

Parsing: ANTLR for .NET

I am trying to parse the following text:

<<! notes, Test!>>

Grammar:

grammar Hello;

prog:   stat+;
stat:   DELIMETER_OPEN expr DELIMETER_CLOSE ;
expr:   NOTES value=VAR_VALUE   # delim_body ;
VAR_VALUE   : [ a-Z A-Z 0-9 ! ];
NOTES   : 'notes,'
        |   ' notes,';
DELIMETER_OPEN  :   '<<!';
DELIMETER_CLOSE :   '!>>';

Error:

line 1:12 token recognition error at: '>' 
line 1:13 token recognition error at: '>' 
line 1:10 mismatched input ' !' expecting VAR_VALUE

(NOTE: Added DELIMITER defs since I forgot them earlier)

Upvotes: 0

Views: 267

Answers (1)

GRosenberg

Reputation: 6001

Try this:

grammar Hello;

prog :      stat+ EOF ;
stat :      DELIMETER_OPEN expr DELIMETER_CLOSE ;
expr :      NOTES COMMA value=VAR_VALUE   # delim_body ;

NOTES :    'notes' ;   // listed before VAR_VALUE: on equal-length matches the first lexer rule wins
COMMA :    ','     ;
VAR_VALUE : ANBang* AlphaNum ;
WS    : [ \t\r\n]+ -> skip ;

DELIMETER_OPEN  :  '<<!';
DELIMETER_CLOSE :  '!>>';

fragment ANBang : AlphaNum | Bang ;
fragment AlphaNum : [a-zA-Z0-9] ;
fragment Bang : '!' ;

Ideally, the lexer rules should be mutually unambiguous. Here, VAR_VALUE is defined so that it must end with an alphanumeric character, which means it can never end with a !. That stops the lexer from consuming the trailing ! as part of VAR_VALUE instead of leaving it for DELIMETER_CLOSE. Of course, this presumes that restricting VAR_VALUE in this way is acceptable. If not, a more involved solution will be required.

Also, as a general principle, skip anything that is not syntactically significant to the parsing.
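
To make the effect of the VAR_VALUE change concrete, here is a minimal Python sketch of a longest-match tokenizer. It is not ANTLR and not the .NET runtime; the function name tokenize, the GREEDY and ENDS_ALNUM patterns, and the regex translations of the rules are assumptions made for illustration. It only imitates two lexer behaviours: the longest match wins, and on a tie the rule listed first wins (which is why NOTES is placed ahead of VAR_VALUE here). Whitespace is matched and then dropped, as the skip command does.

```python
import re

# Hypothetical regex stand-ins for the two VAR_VALUE variants discussed above.
ENDS_ALNUM = r"[a-zA-Z0-9!]*[a-zA-Z0-9]"  # must end with a letter or digit
GREEDY = r"[a-zA-Z0-9!]+"                 # may end with '!'


def tokenize(text, var_value=ENDS_ALNUM):
    """Longest-match tokenizer; earlier rules win ties."""
    rules = [
        ("DELIMETER_OPEN", re.compile(r"<<!")),
        ("DELIMETER_CLOSE", re.compile(r"!>>")),
        ("NOTES", re.compile(r"notes")),
        ("COMMA", re.compile(r",")),
        ("VAR_VALUE", re.compile(var_value)),
        ("WS", re.compile(r"[ \t\r\n]+")),
    ]
    pos, tokens = 0, []
    while pos < len(text):
        best = None  # (rule name, end offset) of the longest match so far
        for name, pattern in rules:
            m = pattern.match(text, pos)
            # Strict '>' keeps the earlier rule when two matches have equal length.
            if m and (best is None or m.end() > best[1]):
                best = (name, m.end())
        if best is None:
            raise ValueError(f"token recognition error at: {text[pos]!r}")
        name, end = best
        if name != "WS":  # whitespace is recognized but not emitted
            tokens.append((name, text[pos:end]))
        pos = end
    return tokens


if __name__ == "__main__":
    for name, value in tokenize("<<! notes, Test!>>"):
        print(name, repr(value))
```

With the ENDS_ALNUM pattern the input splits into DELIMETER_OPEN, NOTES, COMMA, VAR_VALUE ('Test') and DELIMETER_CLOSE. Passing var_value=GREEDY instead lets VAR_VALUE swallow 'Test!', after which no rule matches '>' and the sketch raises an error, much like the token recognition errors in the question.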

Upvotes: 2
