Reputation: 153
I'm trying to use ANTLR to parse C++ source code, using the ANTLR C++ grammar file.
After generating the lexer, parser and listeners (CPP14BaseListener.java, CPP14Lexer.java, CPP14Listener.java, CPP14Parser.java), I'm trying to run them on a C++ file like this:
private void parseCppFile(String file) throws IOException {
    String p1 = readFile(new File(file), Charset.forName("UTF-8"));
    System.out.println(p1);
    // Get our lexer
    CPP14Lexer lexer = new CPP14Lexer(new ANTLRInputStream(p1));
    // Get a list of matched tokens
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    // Pass the tokens to the parser
    CPP14Parser parser = new CPP14Parser(tokens);
    // Walk it and attach our listener
    ParseTreeWalker walker = new ParseTreeWalker();
    // Specify our entry point
    ParseTree entryPoint = null; // TODO: what is the entry point?
    walker.walk(new CPP14BaseListener(), entryPoint);
}
My question is: which of the generated CPP14Parser methods should I use to get the entry point for parsing the file (see the TODO comment)?
Alternatively, a pointer to a working example showing how to parse a C++ source file would be great.
Thanks!
Upvotes: 3
Views: 2997
Reputation: 170158
The entry point of a grammar is usually the rule that ends with EOF. In your case, try the translationunit rule:
ParseTree entryPoint = parser.translationunit();
In case people don't read the comments, I'll add Mike's noteworthy comment to my answer:
... and if that is not the case (ending in EOF), chances are the first parser rule in a grammar is the entry point (especially if it is not called from anywhere). On the other hand, in one of my grammars I defined half a dozen other rules which end with EOF (mostly to parse sub-elements of my language). Sometimes it's tricky... :-)
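If you would rather not eyeball a large grammar, the two heuristics above (rules ending in EOF, and rules not called from anywhere) can be roughly automated. The following is only a sketch: EntryRuleFinder and its methods are hypothetical names, not part of ANTLR, and the regex-based scan is a simplification that ignores things like actions, quotes inside lexer character sets, and rules where only some alternatives end in EOF. It uses nothing but the Java standard library and reads the grammar as plain text.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not an ANTLR API): guesses entry-rule candidates from grammar text.
final class EntryRuleFinder {
    // Block comments, line comments and quoted literals, removed so that ';' or ':' inside them
    // cannot confuse the rule scan.
    private static final Pattern NOISE =
            Pattern.compile("(?s)/\\*.*?\\*/|//[^\\n]*|'(?:\\\\.|[^'\\\\])*'");
    // A parser rule: a name starting with a lowercase letter at the start of a line, ':', body, ';'.
    private static final Pattern RULE = Pattern.compile("(?ms)^\\s*([a-z]\\w*)\\s*:(.*?);");
    // Any lowercase-initial identifier inside a body is treated as a possible rule reference.
    private static final Pattern RULE_REF = Pattern.compile("\\b[a-z]\\w*\\b");
    private static final Pattern ENDS_WITH_EOF = Pattern.compile("\\bEOF\\s*$");

    // Maps each parser rule name to its body, in declaration order.
    static Map<String, String> parserRules(String grammar) {
        String cleaned = NOISE.matcher(grammar).replaceAll(" ");
        Map<String, String> rules = new LinkedHashMap<>();
        Matcher m = RULE.matcher(cleaned);
        while (m.find()) {
            rules.put(m.group(1), m.group(2));
        }
        return rules;
    }

    // Heuristic 1: rules whose body ends with the EOF token.
    static List<String> rulesEndingWithEof(String grammar) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, String> e : parserRules(grammar).entrySet()) {
            if (ENDS_WITH_EOF.matcher(e.getValue()).find()) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    // Heuristic 2: rules that no other rule refers to.
    static List<String> unreferencedRules(String grammar) {
        Map<String, String> rules = parserRules(grammar);
        Set<String> referenced = new LinkedHashSet<>();
        for (Map.Entry<String, String> e : rules.entrySet()) {
            Matcher m = RULE_REF.matcher(e.getValue());
            while (m.find()) {
                if (!m.group().equals(e.getKey())) { // ignore direct self-recursion
                    referenced.add(m.group());
                }
            }
        }
        List<String> result = new ArrayList<>(rules.keySet());
        result.removeAll(referenced);
        return result;
    }

    public static void main(String[] args) {
        // Made-up toy grammar for illustration; in practice, read the real .g4 file into a String.
        String grammar = "grammar Demo;\n"
                + "translationunit : declaration* EOF ;\n"
                + "declaration : ID ';' ;\n"
                + "expressionOnly : expression EOF ;\n"
                + "expression : ID ;\n"
                + "ID : [a-z]+ ;\n";
        System.out.println("Ends with EOF: " + rulesEndingWithEof(grammar));
        System.out.println("Unreferenced:  " + unreferencedRules(grammar));
    }
}
```

In the toy grammar both translationunit and expressionOnly pass both checks, which is exactly the tricky situation described in the comment: the heuristics narrow the candidates down, but you still have to pick the rule that describes a whole file.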
Upvotes: 2