Reputation: 4962
I have a rule in my ANTLR grammar like this:
COMMENT : '/*' (options {greedy=false;} : . )* '*/' ;
This rule simply matches C-style comments: it accepts any pair of /* and */ with arbitrary text in between, and it works fine.
What I want to do now is capture all the text between the /* and the */ when the rule matches, to make it accessible to an action. Something like this:
COMMENT : '/*' e=((options {greedy=false;} : . )*) '*/' {System.out.println("got: " + $e.text);} ;
This approach doesn't work: during parsing it reports "no viable alternative" upon reaching the first character after the "/*".
I'm not really clear on if/how this can be done - any suggestions or guidance welcome, thanks.
Upvotes: 0
Views: 301
Reputation: 170227
Note that you can simply do:
getText().substring(2, getText().length()-2)
on the `COMMENT` token, since the first and the last 2 characters will always be `/*` and `*/`.
You could also remove the `options {greedy=false;} :` since both `.*` and `.+` are ungreedy by default (although without the `.`, that is, applied to anything other than the wildcard, `*` and `+` are greedy) (i).
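The greedy versus ungreedy distinction can be seen outside ANTLR as well. The following is only an analogy using `java.util.regex`, not ANTLR's own matching machinery; the class name `GreedyDemo` and the helper `findAll` are made up for this sketch. A reluctant `.*?` stops at the first `*/`, while a greedy `.*` runs on to the last one and swallows everything in between.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GreedyDemo {
    // Hypothetical helper: collects every match of the regex in the input.
    // DOTALL lets '.' match newlines too, so multi-line comments are covered.
    static List<String> findAll(String regex, String input) {
        List<String> matches = new ArrayList<>();
        Matcher m = Pattern.compile(regex, Pattern.DOTALL).matcher(input);
        while (m.find()) {
            matches.add(m.group());
        }
        return matches;
    }

    public static void main(String[] args) {
        String input = "/* a */ x /* b */";
        // Reluctant quantifier: each comment is matched separately.
        System.out.println(findAll("/\\*.*?\\*/", input)); // prints [/* a */, /* b */]
        // Greedy quantifier: one match spanning both comments.
        System.out.println(findAll("/\\*.*\\*/", input));  // prints [/* a */ x /* b */]
    }
}
```

A greedy loop in a comment rule would misbehave in the same way on input with more than one comment, which is why the ungreedy behaviour of `.*` matters here.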
Alternatively, call `setText(...)` on the `Comment` token to discard the `/*` and `*/` immediately. A little demo:
file `T.g`:
grammar T;

@parser::members {
  public static void main(String[] args) throws Exception {
    ANTLRStringStream in = new ANTLRStringStream(
        "/* abc */ \n" +
        " \n" +
        "/* \n" +
        " DEF \n" +
        "*/ "
    );
    TLexer lexer = new TLexer(in);
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    TParser parser = new TParser(tokens);
    parser.parse();
  }
}

parse
  : ( Comment {System.out.printf("parsed :: >\%s<\%n", $Comment.getText());} )+ EOF
  ;

Comment
  : '/*' .* '*/' {setText(getText().substring(2, getText().length()-2));}
  ;

Space
  : (' ' | '\t' | '\r' | '\n') {skip();}
  ;
Then generate a parser & lexer, compile all .java files and run the parser containing the main method:
java -cp antlr-3.2.jar org.antlr.Tool T.g
javac -cp antlr-3.2.jar *.java
java -cp .:antlr-3.2.jar TParser
(or `java -cp .;antlr-3.2.jar TParser` on Windows)
which will produce the following output:
parsed :: > abc <
parsed :: >
DEF
<
(i) The Definitive ANTLR Reference, Chapter 4, Extended BNF Subrules, page 86.
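The index arithmetic in the `substring(2, length()-2)` call can be checked in isolation. This is a minimal plain-Java sketch, independent of ANTLR; the class `CommentText` and method `stripDelimiters` are hypothetical names, and the extra validation is an addition of this sketch (inside the lexer action it is unnecessary, because the rule has already matched both delimiters).

```java
public class CommentText {
    // Hypothetical helper: strips the leading "/*" and trailing "*/".
    // The length check rejects inputs such as "/*/", where the two
    // delimiters would overlap.
    static String stripDelimiters(String comment) {
        if (comment.length() < 4 || !comment.startsWith("/*") || !comment.endsWith("*/")) {
            throw new IllegalArgumentException("not a complete /* ... */ comment: " + comment);
        }
        return comment.substring(2, comment.length() - 2);
    }

    public static void main(String[] args) {
        System.out.println(">" + stripDelimiters("/* abc */") + "<"); // prints > abc <
        System.out.println(">" + stripDelimiters("/**/") + "<");      // prints ><
    }
}
```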
Upvotes: 4
Reputation: 10939
Try this:
COMMENT :
'/*' {StringBuilder comment = new StringBuilder();} ( options {greedy=false;} : c=. {comment.appendCodePoint(c);} )* '*/' {System.out.println(comment.toString());};
Another way, which actually returns the StringBuilder object so you can use it in your program:
COMMENT returns [StringBuilder comment]:
'/*' {comment = new StringBuilder();} ( options {greedy=false;} : c=. {comment.append((char)c);} )* '*/';
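Both variants convert `c` before appending it. Assuming the label `c` is an `int` in the generated lexer (as the `(char)` cast suggests), the conversion matters because `StringBuilder.append(int)` would append the decimal digits of the value rather than the character. The plain-Java sketch below, with the made-up class name `AppendDemo`, shows the three behaviours side by side.

```java
public class AppendDemo {
    // Appends each value as a Unicode code point (as in the first variant).
    static String viaCodePoint(int[] chars) {
        StringBuilder sb = new StringBuilder();
        for (int c : chars) {
            sb.appendCodePoint(c);
        }
        return sb.toString();
    }

    // Appends each value after a cast to char (as in the second variant).
    static String viaCharCast(int[] chars) {
        StringBuilder sb = new StringBuilder();
        for (int c : chars) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    // Pitfall: append(int) writes the number itself, not the character.
    static String viaPlainAppend(int[] chars) {
        StringBuilder sb = new StringBuilder();
        for (int c : chars) {
            sb.append(c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        int[] input = {'a', 'b'};
        System.out.println(viaCodePoint(input));   // prints ab
        System.out.println(viaCharCast(input));    // prints ab
        System.out.println(viaPlainAppend(input)); // prints 9798
    }
}
```

For values that fit in a single `char` the first two approaches give the same result; they differ only for code points above U+FFFF, where the cast truncates the value.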
Upvotes: 1