Manish Mulani

Reputation: 7375

Static code parser for Java source code to extract methods / comments

I'm looking for a parser that can extract the methods from a Java class (static source code, i.e. a .java file), along with each method's signature, comments/documentation, and variables. Preferably the parser itself is written in Java.

Could someone please advise?

Thanks.

Upvotes: 5

Views: 4951

Answers (3)

Assaf Gamliel

Reputation: 12015

Here is what I do to extract the method signatures from one or more Java files:

I open the file I want the signatures from in Sublime Text 2, then do a find (Ctrl+F) with regular expressions enabled, using the following regex I wrote (I tested it on my code and it works; I hope it will work for you too):

((synchronized +)?(public|private|protected) +(static [a-zA-Z\[\]]+|[a-zA-Z\[\]]+) [a-zA-Z]+\([a-zA-Z ,\[\]]*\)\n?[a-zA-Z ,\t\n]*\{)

After Sublime Text 2 highlights the results, I click "Find All", copy (Ctrl+C), open a new tab (Ctrl+N) and paste (Ctrl+V).
You will then see all your method signatures.

I hope this helps.
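If you would rather run the same idea from Java than from an editor, a minimal sketch with java.util.regex might look like the following. The class name SignatureGrep, the method extract, and the pattern itself are illustrative assumptions (a looser approximation of the regex above, not a translation of it). It assumes every method starts with an access modifier and has a body, so it skips package-private methods, abstract methods and constructors, and it will not cope with generic types that contain spaces.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; not part of any library.
class SignatureGrep {
    // Group 1 captures the signature without the opening brace.
    private static final Pattern METHOD = Pattern.compile(
            "((?:public|private|protected)\\s+"              // access modifier (required here)
            + "(?:(?:static|final|synchronized)\\s+)*"       // further modifiers, if any
            + "[\\w.<>\\[\\]]+\\s+"                          // return type (no spaces allowed)
            + "\\w+\\s*\\([^)]*\\)"                          // method name and parameter list
            + "(?:\\s*throws\\s+[\\w.,\\s]+?)?)"             // optional throws clause
            + "\\s*\\{");                                    // opening brace of the body

    /** Returns every substring of the source that looks like a method signature. */
    static List<String> extract(String source) {
        List<String> signatures = new ArrayList<>();
        Matcher m = METHOD.matcher(source);
        while (m.find()) {
            signatures.add(m.group(1).trim());
        }
        return signatures;
    }

    public static void main(String[] args) {
        String demo = "public class Demo {\n"
                + "    private int count;\n"
                + "    public static int add(int a, int b) {\n"
                + "        return a + b;\n"
                + "    }\n"
                + "    protected synchronized void reset() {\n"
                + "        count = 0;\n"
                + "    }\n"
                + "}\n";
        extract(demo).forEach(System.out::println);
    }
}
```

Like the editor regex, this only pattern-matches text: a signature inside a comment or a string literal will be reported too.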

Upvotes: 5

Ira Baxter

Reputation: 95430

If all you want is the exact text of each method, and the exact text of the variables inside methods, you could get by with a parser that produces a CST, walking the CST to find the right nodes, and then prettyprinting the found subtrees. ANTLR has a Java parser that would work for this. I don't know if it will capture comments. I think the main distribution of ANTLR is coded in Java.
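To make the parse-then-walk idea concrete without pulling in ANTLR, here is a rough sketch that swaps in a different parser: the Compiler Tree API that ships with the JDK (com.sun.source, javax.tools). It only parses, so there is no name resolution. The class name MethodLister, the method describe and the output format are hypothetical names chosen for this example, and it assumes it runs on a full JDK (Java 9 or later), since the system compiler is not available on a bare JRE.

```java
import com.sun.source.tree.CompilationUnitTree;
import com.sun.source.tree.MethodTree;
import com.sun.source.tree.VariableTree;
import com.sun.source.util.DocTrees;
import com.sun.source.util.JavacTask;
import com.sun.source.util.TreePathScanner;

import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; uses the JDK's parser, not ANTLR.
class MethodLister {
    /** Parses the source text and lists each method, its doc comment and its variables. */
    static List<String> describe(String fileName, String source) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system compiler found; a JDK is required");
        }
        // Wrap the in-memory source text so the compiler can read it like a file.
        JavaFileObject file = new SimpleJavaFileObject(
                URI.create("string:///" + fileName), JavaFileObject.Kind.SOURCE) {
            @Override
            public CharSequence getCharContent(boolean ignoreEncodingErrors) {
                return source;
            }
        };
        JavacTask task = (JavacTask) compiler.getTask(null, null, null, null, null, List.of(file));
        DocTrees docs = DocTrees.instance(task);
        List<String> out = new ArrayList<>();
        try {
            for (CompilationUnitTree unit : task.parse()) {   // parse only, no type checking
                new TreePathScanner<Void, Void>() {
                    private String current;                   // method being walked; null at class level

                    @Override
                    public Void visitMethod(MethodTree m, Void p) {
                        String outer = current;
                        current = m.getName().toString();
                        String doc = docs.getDocComment(getCurrentPath());
                        out.add("method " + current + (doc == null ? "" : " doc: " + doc.trim()));
                        super.visitMethod(m, p);              // descends into parameters and body
                        current = outer;
                        return null;
                    }

                    @Override
                    public Void visitVariable(VariableTree v, Void p) {
                        if (current != null) {                // skip fields, keep parameters and locals
                            out.add("  var " + v.getType() + " " + v.getName() + " in " + current);
                        }
                        return super.visitVariable(v, p);
                    }
                }.scan(unit, null);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out;
    }

    public static void main(String[] args) {
        String demo = "class Demo {\n"
                + "  /** Adds two numbers. */\n"
                + "  int add(int a, int b) { int sum = a + b; return sum; }\n"
                + "}\n";
        describe("Demo.java", demo).forEach(System.out::println);
    }
}
```

Calling toString() on a MethodTree gives back pretty-printed source for that method, which covers the "prettyprint the found subtree" step if you want the full method text rather than just names.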

You can likely do this more hackily, in Java, with a lexer for Java, implementing what amounts to a bad island parser that looks for the key phrases. ("After 'class', find '{' and print out everything you find up to the matching '}'" would give you all the methods and fields).
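A bare-bones sketch of that "find 'class', then the matching braces" hack is below. It scans raw characters instead of using a real lexer, so it assumes there are no braces inside string literals, character literals or comments, and no stray word "class" before the declaration; a proper lexer would remove those assumptions. The names ClassBodyIsland and classBody are made up for this example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; a character scan standing in for a lexer.
class ClassBodyIsland {
    private static final Pattern CLASS_KEYWORD = Pattern.compile("\\bclass\\b");

    /** Returns the text between the braces of the first class, or null if none is found. */
    static String classBody(String src) {
        Matcher m = CLASS_KEYWORD.matcher(src);
        if (!m.find()) {
            return null;
        }
        int open = src.indexOf('{', m.end());
        if (open < 0) {
            return null;
        }
        int depth = 0;
        for (int i = open; i < src.length(); i++) {
            char c = src.charAt(i);
            if (c == '{') {
                depth++;
            } else if (c == '}' && --depth == 0) {
                // Back at the nesting level of the class: this brace closes it.
                return src.substring(open + 1, i);
            }
        }
        return null; // braces never balanced
    }

    public static void main(String[] args) {
        String demo = "package p;\n"
                + "public class A {\n"
                + "  int x;\n"
                + "  void f() { if (x > 0) { x--; } }\n"
                + "}\n";
        System.out.println(classBody(demo));
    }
}
```

The same depth counter, restarted at each '{' found at depth 1, would split the class body into individual method bodies.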

If you want more precise detail (e.g., you want to know the actual type of an argument rather than just its name, or where the type is actually defined), you'll need a parser with a full front end and name resolution. (ANTLR won't do this.) The Eclipse JDT certainly builds trees; it likely does name resolution. Our DMS Software Reengineering Toolkit with its Java Front End can provide everything necessary for this task, including comment capture and extraction. DMS isn't coded in Java.

You objected to Javadoc as being inadequate, because it doesn't give you the content of methods. Perhaps our Java Source Browser, which does give you that code, would serve better. It integrates name resolution data from our DMS/Java Front End to hyperlink JavaDoc-type information into browsable source text; all fields as well as local variables are explicitly indexed. The Source Browser isn't coded in Java, but then presumably you simply want to run it and scrape your result. Such scraping might be harder than it appears staring at the screen; there's a lot of HTML behind such a display.

Upvotes: 1

Suraj Chandran

Reputation: 24801

You can use Eclipse's ASTParser. It's super simple to use.

Find a quick standalone example here.

Upvotes: 8
