Roy

Reputation: 153

ANTLR parser example with C++ grammar

I'm trying to use ANTLR to parse C++ source code with the ANTLR C++ grammar file.

After generating the lexer, parser and listeners (CPP14BaseListener.java, CPP14Lexer.java, CPP14Listener.java, CPP14Parser.java), I tried to run them on a C++ file like this:

private void parseCppFile(String file) throws IOException {
    String p1 = readFile(new File(file), Charset.forName("UTF-8"));
    System.out.println(p1);
    // Get our lexer
    CPP14Lexer lexer = new CPP14Lexer(new ANTLRInputStream(p1));
    // Get a list of matched tokens
    CommonTokenStream tokens = new CommonTokenStream(lexer);
    // Pass the tokens to the parser
    CPP14Parser parser = new CPP14Parser(tokens);
    // Walk it and attach our listener
    ParseTreeWalker walker = new ParseTreeWalker();
    // Specify our entry point
    ParseTree entryPoint = null;//TODO: what is the entry point?
    walker.walk(new CPP14BaseListener(), entryPoint);
}

My question: which of the generated CPP14Parser methods should I call to get the entry point for parsing the file (see the TODO comment)?

Alternatively, a pointer to a working example that shows how to parse a C++ source file would be great.

Thanks!

Upvotes: 3

Views: 2997

Answers (1)

Bart Kiers

Reputation: 170158

The entry point of a grammar is usually the rule that ends with EOF. In your case, try the translationunit rule:

ParseTree entryPoint = parser.translationunit();

In case people don't read the comments, I'll add Mike's noteworthy comment to my answer:

... and if that is not the case (ending in EOF), chances are the first parser rule in a grammar is the entry point (especially if it is not called from anywhere). On the other hand, in one of my grammars I defined half a dozen other rules which end with EOF (mostly to parse sub-elements of my language). Sometimes it's tricky... :-)
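If you'd rather not eyeball a large grammar, the "rule that ends with EOF" heuristic can be roughly automated. The sketch below is only an illustration and not part of ANTLR: the class name EntryRuleFinder and the method findEofRules are made up here, and the regex scan is deliberately naive (it assumes no ';' literals, comments or embedded actions inside rule bodies). It lists the lowercase-named (parser) rules whose body ends with EOF, using a tiny inline grammar as input:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not an ANTLR API: a naive text scan over a .g4 grammar.
class EntryRuleFinder {
    // Matches "name : body ;" where the name starts with a lowercase letter
    // (ANTLR parser rules); lexer rules start with an uppercase letter and are skipped.
    private static final Pattern RULE = Pattern.compile(
            "^\\s*([a-z][A-Za-z0-9_]*)\\s*:(.*?);",
            Pattern.MULTILINE | Pattern.DOTALL);

    // Returns the names of parser rules whose body ends with the EOF token.
    static List<String> findEofRules(String grammar) {
        List<String> result = new ArrayList<>();
        Matcher m = RULE.matcher(grammar);
        while (m.find()) {
            String body = m.group(2).trim();
            if (body.endsWith("EOF")) {
                result.add(m.group(1));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // A small stand-in grammar; in practice you would read the real .g4 file.
        String grammar = String.join("\n",
                "grammar Demo;",
                "",
                "translationunit",
                "   : declarationseq? EOF",
                "   ;",
                "",
                "declarationseq",
                "   : declaration+",
                "   ;",
                "",
                "declaration",
                "   : Identifier",
                "   ;",
                "",
                "Identifier",
                "   : [a-zA-Z_]+",
                "   ;");
        System.out.println(findEofRules(grammar)); // prints [translationunit]
    }
}
```

As Mike's comment warns, a grammar may contain several rules ending with EOF, so treat the result as a list of candidates rather than a definitive answer.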

Upvotes: 2
