Reputation: 8536
I have an antlr grammer with subtrees like this:
^(type ID)
that I want to convert to:
^(type DUMMY ID)
where type is 'a'|'b'.
Note: what I really want to do is convert anonymous instantiations to explicit by generating dummy names.
I've narrowed it down to the grammars below, but I'm getting this:
(a bar) (b bar)
got td
got bu
Exception in thread "main" org.antlr.runtime.tree.RewriteEmptyStreamException: rule type
at org.antlr.runtime.tree.RewriteRuleElementStream._next(RewriteRuleElementStream.java:157)
at org.antlr.runtime.tree.RewriteRuleSubtreeStream.nextNode(RewriteRuleSubtreeStream.java:77)
at Pattern.bu(Pattern.java:382)
The error message continues. My debug so far:
What the proper way to do this kind of a rewrite?
My main grammer Rewrite.g:
grammar Rewrite;
options {
output=AST;
}
@members{
public static void main(String[] args) throws Exception {
RewriteLexer lexer = new RewriteLexer(new ANTLRStringStream("a foo\nb bar"));
RewriteParser parser = new RewriteParser(new CommonTokenStream(lexer));
CommonTree tree = (CommonTree)parser.test().getTree();
System.out.println(tree.toStringTree());
CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
Pattern p = new Pattern(nodes);
CommonTree newtree = (CommonTree) p.downup(tree);
}
}
type
: 'a'
| 'b'
;
test : id+;
id : type ID -> ^(type ID["bar"]);
DUMMY : 'dummy';
ID : ('a'..'z')+;
WS : (' '|'\n'|'r')+ {$channel=HIDDEN;};
and Pattern.g
tree grammar Pattern;
options {
tokenVocab = Rewrite;
ASTLabelType=CommonTree;
output=AST;
filter=true; // tree pattern matching mode
}
topdown
: td
;
bottomup
: bu
;
type
: 'a'
| 'b'
;
td
: ^(type ID) { System.out.println("got td"); }
;
bu
: ^(type ID) { System.out.println("got bu"); }
-> ^(type DUMMY ID)
;
to do compile:
java -cp ../jar/antlr-3.4-complete-no-antlrv2.jar org.antlr.Tool Rewrite.g
java -cp ../jar/antlr-3.4-complete-no-antlrv2.jar org.antlr.Tool Pattern.g
javac -cp ../jar/antlr-3.4-complete-no-antlrv2.jar *.java
java -classpath .:../jar/antlr-3.4-complete-no-antlrv2.jar RewriteParser
EDIT 1: I have also tried using antlr4 and I get the same crash.
Upvotes: 2
Views: 1072
Reputation: 3809
There are two small problems to address to get the rewrite to work, one problem in Rewrite
and the other in Pattern
.
The Rewrite
grammar produces ^(type ID)
as root elements in the output AST, as shown in the output (a bar) (b bar)
. A root element can't be transformed because transforming is actually a form of child-swapping: the element's parent drops the element and replaces it with the new, "transformed" version. Without a parent, you'll get the error Can't set single child to a list
. Adding the root is a matter of creating an imaginary token ROOT
or whatever name you like and referencing it in your entry-level rule's AST generation like so: test : id+ -> ^(ROOT id+);
.
The Pattern
grammar, the one producing the error you're getting, is confused by the type
rule: type : 'a' | 'b' ;
as part of the rewrite. I don't know the low-level details here, but apparently a tree parser doesn't maintain the state of a visited root rule like type
in ^(type ID)
when writing a transform (or maybe it can't or shouldn't, or maybe it's some other limitation). The easiest way to address this is with the following two changes:
ID
in the lexer by changing rule type
in Rewrite
from type: 'a' | 'b';
to just type: ID;
. bu
in Pattern
match against ^(ID ID)
and transform to ^(ID DUMMY ID)
.Now with a couple of minor debugging changes to Rewrite
's main
, input "a foo\nb bar"
produces the following output:
(ROOT (a foo) (b bar))
got td
got bu
(a foo) -> (a DUMMY foo)
got td
got bu
(b bar) -> (b DUMMY bar)
(ROOT (a DUMMY foo) (b DUMMY bar))
Here are the files as I've changed them:
grammar Rewrite;
options {
output=AST;
}
tokens {
ROOT;
}
@members{
public static void main(String[] args) throws Exception {
RewriteLexer lexer = new RewriteLexer(new ANTLRStringStream("a foo\nb bar"));
RewriteParser parser = new RewriteParser(new CommonTokenStream(lexer));
CommonTree tree = (CommonTree)parser.test().getTree();
System.out.println(tree.toStringTree());
CommonTreeNodeStream nodes = new CommonTreeNodeStream(tree);
Pattern p = new Pattern(nodes);
CommonTree newtree = (CommonTree) p.downup(tree, true); //print the transitions to help debugging
System.out.println(newtree.toStringTree()); //print the final result
}
}
type : ID;
test : id+ -> ^(ROOT id+);
id : type ID -> ^(type ID);
DUMMY : 'dummy';
ID : ('a'..'z')+;
WS : (' '|'\n'|'r')+ {$channel=HIDDEN;};
tree grammar Pattern;
options {
tokenVocab = Rewrite;
ASTLabelType=CommonTree;
output=AST;
filter=true; // tree pattern matching mode
}
topdown
: td
;
bottomup
: bu
;
td
: ^(ID ID) { System.out.println("got td"); }
;
bu
: ^(ID ID) { System.out.println("got bu"); }
-> ^(ID DUMMY ID)
;
Upvotes: 2
Reputation: 170188
I don't have much experience with tree-patterns, with or without rewrites. But when using rewrite rules in them, I believe your options should also include rewrite=true;
. The Definitive ANTLR Reference doesn't handle them, so I'm not entirely sure (have a look at the ANTLR wiki for more info).
However, for such (relatively) simple rewrites, you don't really need a separate grammar. You could make DUMMY
an imaginary token and inject it in some other parser rule, like this:
grammar T;
options {
output=AST;
}
tokens {
DUMMY;
}
test : id+;
id : type ID -> ^(type DUMMY["dummy"] ID);
type
: 'a'
| 'b'
;
ID : ('a'..'z')+;
WS : (' '|'\n'|'r')+ {$channel=HIDDEN;};
which would parse the input:
a bar
b foo
into the following AST:
Note that if your lexer is also meant to tokenize the input "dummy"
as a DUMMY
token, change the tokens { ... }
block into this:
tokens {
DUMMY='dummy';
}
and you'd still be able to inject a DUMMY
in other rules.
Upvotes: 1