Reputation: 4587
I have the following grammar and I want to parse inputs to get the associated ASTs. This is easy with ANTLR for Java: since ANTLR 4, you no longer have to specify the option `output=AST;` in the grammar file to get tree information.
Hello.g
grammar Hello; // Define a grammar called Hello
stat : expr NEWLINE
| ID '=' expr NEWLINE
| NEWLINE
| expr
;
expr: atom (op atom)* ;
op : '+'|'-' ;
atom : INT | ID;
ID : [a-zA-Z]+ ;
INT : [0-9]+ ;
NEWLINE : '\r' ? '\n' ;
WS : [ \t\r\n]+ -> skip ;
Test.java
import org.antlr.v4.runtime.*;
import org.antlr.v4.runtime.tree.*;
import java.io.*;
import lib.HelloLexer;
import lib.HelloParser;
public class Test {
    public static void main(String[] args) throws Exception {
        ANTLRInputStream input = new ANTLRInputStream("5 + 3");
        // create a lexer that feeds off of input CharStream
        HelloLexer lexer = new HelloLexer(input);
        // create a buffer of tokens pulled from the lexer
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        // create a parser that feeds off the tokens buffer
        HelloParser parser = new HelloParser(tokens);
        ParseTree tree = parser.expr(); // begin parsing at the expr rule
        // print LISP-style tree
        System.out.println(tree.toStringTree(parser));
    }
}
The output will be:
(expr (atom 5) (op +) (atom 3))
How can I obtain the same result with the Python implementation? I'm currently using the ANTLR 3.1.3 runtime for Python. The following code only returns "(+ 5 3)":
Test.py
import sys
import antlr3
import antlr3.tree
from antlr3.tree import Tree
from HelloLexer import *
from HelloParser import *
char_stream = antlr3.ANTLRStringStream('5 + 3')
lexer = HelloLexer(char_stream)
tokens = antlr3.CommonTokenStream(lexer)
parser = HelloParser(tokens)
r = parser.stat()
print r.tree.toStringTree()
Upvotes: 1
Views: 1367
Reputation: 1304
There is now an ANTLR 4 runtime for Python (https://theantlrguy.atlassian.net/wiki/display/ANTLR4/Python+Target), but in the Python runtimes toStringTree is a class method of Trees. You can call it like this to get the LISP-style parse tree, including stringified tokens:
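from antlr4 import *
from antlr4.tree.Trees import Trees
# import your generated parser & lexer here (module names assume the Hello grammar from the question)
from HelloLexer import HelloLexer
from HelloParser import HelloParser
# set up your stream, lexer, parser and tree as usual
lexer = HelloLexer(InputStream("5 + 3"))
parser = HelloParser(CommonTokenStream(lexer))
tree = parser.expr()
# the None is the optional list of rule names; the parser supplies them instead
print(Trees.toStringTree(tree, None, parser))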
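If it helps to see what a LISP-style tree printer does with the rule names, here is a minimal sketch in plain Python. It does not use antlr4 at all: the Node class, the RULE_NAMES list and the to_string_tree function are hypothetical stand-ins made up for illustration, and the real runtime may differ in details such as whitespace escaping.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical stand-in for a parse-tree node (not an antlr4 class)."""
    rule_index: Optional[int] = None  # set on rule nodes
    text: Optional[str] = None        # set on token (leaf) nodes
    children: List["Node"] = field(default_factory=list)


def to_string_tree(node: Node, rule_names: List[str]) -> str:
    """Render a tree as nested parentheses: (rule child child ...)."""
    if node.rule_index is None:
        return node.text              # a token prints as its own text
    label = rule_names[node.rule_index]
    if not node.children:
        return label                  # a rule with no children prints bare
    parts = " ".join(to_string_tree(c, rule_names) for c in node.children)
    return f"({label} {parts})"


# Assumed rule order, following the grammar in the question.
RULE_NAMES = ["stat", "expr", "op", "atom"]

# Hand-built tree for "5 + 3" under expr: atom (op atom)*
tree = Node(1, children=[
    Node(3, children=[Node(text="5")]),
    Node(2, children=[Node(text="+")]),
    Node(3, children=[Node(text="3")]),
])

print(to_string_tree(tree, RULE_NAMES))
```

The rule names are only needed to turn a rule index into a readable label, which is why passing the parser is enough: it already knows them.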
Upvotes: 1
Reputation: 100029
There is currently no Python target for ANTLR 4, and ANTLR 3 did not support the automatic generation of parse trees to produce the output you are looking at.
You might be able to use the AST creation functionality in ANTLR 3 to produce a tree, but it will not have the same form as (and certainly not the simplicity of) the parse tree ANTLR 4 gives you.
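To make the difference in form concrete, here is a small hand-written sketch in plain Python that builds both shapes for the expr rule from the question. It is not ANTLR output and uses no ANTLR API: tokenize, parse_tree, ast and show are hypothetical helpers, the input is assumed to be a well-formed expression, and the operator-rooted AST shape is simply chosen to mirror the "(+ 5 3)" result reported in the question.

```python
import re


def tokenize(text):
    """Split input into INT, ID, '+' and '-' tokens, skipping whitespace."""
    return re.findall(r"[0-9]+|[a-zA-Z]+|[+-]", text)


def parse_tree(tokens):
    """Parse-tree shape: one node per rule matched in expr: atom (op atom)*."""
    children = [("atom", tokens[0])]
    for i in range(1, len(tokens), 2):
        children.append(("op", tokens[i]))
        children.append(("atom", tokens[i + 1]))
    return ("expr", *children)


def ast(tokens):
    """AST shape: operators become roots, operands hang below (left-assoc)."""
    node = tokens[0]
    for i in range(1, len(tokens), 2):
        node = (tokens[i], node, tokens[i + 1])
    return node


def show(node):
    """Print nested tuples in LISP style."""
    if isinstance(node, str):
        return node
    return "(" + " ".join(show(child) for child in node) + ")"


tokens = tokenize("5 + 3")
print(show(parse_tree(tokens)))  # (expr (atom 5) (op +) (atom 3))
print(show(ast(tokens)))         # (+ 5 3)
```

The parse tree records every rule that was matched, while the AST keeps only what the grammar author chose to keep, so the two cannot be converted into each other without knowing the grammar.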
Upvotes: 1