Reputation: 115
I'm modifying a DSL grammar for a product that is in public use.
Currently all /*...*/ comments are silently ignored, but I need to modify it so that comments placed before certain key elements are parsed into the AST.
I need to maintain backwards compatibility: users must still be able to add comments anywhere in the DSL, and only those key comments should be included in the AST.
The parser grammar currently looks a bit like this:
grammar StateGraph;
graph: 'graph' ID '{' graph_body '}';
graph_body: state+;
state: 'state' ID '{' state_body '}';
state_body: transition* ...etc...;
transition: 'transition' (transition_condition) ID ';';
COMMENT: '/*' ( options {greedy=false;} : . )* '*/' {skip();};
Comments placed before the 'graph' and 'state' elements contain meaningful description and annotations and need to be included within the parsed AST. So I've modified those two rules and am no longer skipping COMMENT:
graph: comment* 'graph' ID '{' graph_body '}';
state: comment* 'state' ID '{' state_body '}';
COMMENT: '/*' ( options {greedy=false;} : . )* '*/';
If I naively use the above, the other comments cause mismatched token errors when subsequently executing the tree parser. How do I ignore all instances of COMMENT that are not placed in front of 'graph' or 'state'?
An example DSL would be:
/* Some description
 * @some.meta.info
 */
graph myGraph {
  /* Some description of the state.
   * @some.meta.info about the state
   */
  state first {
    transition if (true) second; /* this comment ignored */
  }
  state second {
  }
  /* this comment ignored */
}
Upvotes: 2
Views: 208
Reputation: 115
This is the solution I've actually got working. I'd love feedback.
The basic idea is to send comments to the HIDDEN channel, manually extract them in the places where I want them, and to use rewrite rules to re-insert the comments where needed. The extraction step is inspired by the information here: http://www.antlr.org/wiki/pages/viewpage.action?pageId=557063.
The grammar is now:
grammar StateGraph;
tokens { COMMENTS; }
@members {
// matches comments immediately preceding specified token on any channel -> ^(COMMENTS COMMENT*)
  CommonTree treeOfCommentsBefore(Token token) {
    List<Token> comments = new ArrayList<Token>();
    // walk backwards from the token, collecting comments and skipping whitespace
    for (int i = token.getTokenIndex() - 1; i >= 0; i--) {
      Token t = input.get(i);
      if (t.getType() == COMMENT) {
        comments.add(t);
      }
      else if (t.getType() != WS) {
        break;
      }
    }
    // comments were collected last-to-first; restore source order
    java.util.Collections.reverse(comments);
    CommonTree commentsTree = new CommonTree(new CommonToken(COMMENTS, "COMMENTS"));
    for (Token t : comments) {
      commentsTree.addChild(new CommonTree(t));
    }
    return commentsTree;
  }
}
graph
: 'graph' ID '{' graph_body '}'
-> ^(ID {treeOfCommentsBefore($start)} graph_body);
graph_body: state+;
state
: 'state' ID '{' state_body '}'
-> ^(ID {treeOfCommentsBefore($start)} state_body);
state_body: transition* ...etc...;
transition: 'transition' (transition_condition) ID ';';
COMMENT: '/*' .* '*/' {$channel=HIDDEN;};
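To make the backward scan easier to follow outside the grammar, here is a minimal plain-Java sketch of the same idea with no ANTLR dependency. The names CommentScan, Kind, Tok and commentsBefore are hypothetical stand-ins for the lexer's token types and for treeOfCommentsBefore; the sketch only assumes that whitespace and comments are still present in the token list, as they are when they sit on the hidden channel.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CommentScan {
    // Hypothetical token kinds standing in for the lexer's token types.
    enum Kind { COMMENT, WS, KEYWORD, OTHER }

    // Hypothetical minimal token: a kind plus its text.
    static final class Tok {
        final Kind kind;
        final String text;
        Tok(Kind kind, String text) { this.kind = kind; this.text = text; }
    }

    // Returns, in source order, the comments directly preceding tokens.get(index).
    // Whitespace between the comments and the token is skipped; any other
    // token ends the scan.
    static List<String> commentsBefore(List<Tok> tokens, int index) {
        List<String> comments = new ArrayList<String>();
        for (int i = index - 1; i >= 0; i--) {
            Tok t = tokens.get(i);
            if (t.kind == Kind.COMMENT) {
                comments.add(t.text);
            } else if (t.kind != Kind.WS) {
                break;
            }
        }
        Collections.reverse(comments); // collected last-to-first
        return comments;
    }

    public static void main(String[] args) {
        List<Tok> tokens = Arrays.asList(
            new Tok(Kind.COMMENT, "/* Some description */"),
            new Tok(Kind.WS, "\n"),
            new Tok(Kind.KEYWORD, "graph"),
            new Tok(Kind.WS, " "),
            new Tok(Kind.OTHER, "myGraph"));
        // Comments before the 'graph' keyword at index 2.
        System.out.println(commentsBefore(tokens, 2)); // prints [/* Some description */]
    }
}
```

Because the scan is only triggered from the rules that want comments, a comment anywhere else stays hidden and never reaches the tree.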
Upvotes: 1
Reputation: 170308
How do I ignore all instances of COMMENT that are not placed in front of 'graph' or 'state'?
You can do that by checking, after the closing "*/" of a comment, whether either 'graph' or 'state' is ahead, with some optional spaces in between. If that is the case, don't do anything; if not, the predicate fails, you fall through to the other alternative, and simply skip() the comment token.
In ANTLR syntax that would look like:
COMMENT
: '/*' .* '*/' ( (SPACE* (GRAPH | STATE))=> /* do nothing, so keep this token */
| {skip();} /* or else, skip it */
)
;
GRAPH : 'graph';
STATE : 'state';
SPACES : SPACE+ {skip();};
fragment SPACE : ' ' | '\t' | '\r' | '\n';
Note that .* and .+ are ungreedy by default: no need to set options{greedy=false;}.
Also, be careful not to use SPACES in your COMMENT rule, since SPACES executes the skip() method when called!
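The lookahead check can also be sketched in plain Java, which may help when reasoning about which comments survive. This is only an illustration of the predicate's logic, not ANTLR-generated code; CommentLookahead, followedByKeyword and keptComments are hypothetical names. Like the predicate, it looks only at the characters right after the comment, so a comment followed by another comment is dropped, and any text that merely starts with "graph" or "state" counts as a match.

```java
import java.util.ArrayList;
import java.util.List;

class CommentLookahead {
    // Same characters as the SPACE fragment in the grammar.
    static boolean isSpace(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    // True if, from pos, optional whitespace is followed by "graph" or "state".
    static boolean followedByKeyword(String src, int pos) {
        int i = pos;
        while (i < src.length() && isSpace(src.charAt(i))) {
            i++;
        }
        return src.startsWith("graph", i) || src.startsWith("state", i);
    }

    // Returns the /*...*/ comments that the lookahead would keep.
    static List<String> keptComments(String src) {
        List<String> kept = new ArrayList<String>();
        int from = 0;
        while (true) {
            int open = src.indexOf("/*", from);
            if (open < 0) break;
            int close = src.indexOf("*/", open + 2);
            if (close < 0) break; // unterminated comment: stop scanning
            int end = close + 2;
            if (followedByKeyword(src, end)) {
                kept.add(src.substring(open, end));
            }
            from = end;
        }
        return kept;
    }

    public static void main(String[] args) {
        String dsl = "/* g */\ngraph myGraph {\n"
                   + "  /* s */\n  state first {\n"
                   + "    transition if (true) second; /* x */\n  }\n"
                   + "  /* y */\n}";
        System.out.println(keptComments(dsl)); // prints [/* g */, /* s */]
    }
}
```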
Upvotes: 0
Reputation: 2029
Does this work for you?
grammar StateGraph;
graph: 'graph' ID '{' graph_body '}';
graph_body: state+;
state: .COMMENT 'state' ID '{' state_body '}';
state_body: .COMMENT transition* ...etc...;
transition: 'transition' (transition_condition) ID ';';
COMMENT: '/*' ( options {greedy=false;} : . )* '*/' {skip();};
Upvotes: 0