Skar
Skar

Reputation: 33

How can I access hidden tokens in ANTLR AST?

I am trying to write a manual tree walker in Java for an AST generated by ANTLR V3. The AST is built using island grammers as similar to the one specified in ANTLR: call a rule from a different grammar.

In the AST, I have a node for expression list with each expression as child node. Now I need to know the line numbers of the COMMAs which seperated the expressions. The COMMAs were present in parsing but removed during AST rewrite.

I see some resources(here and here) pointing to the usage of CommonTokenStream.getTokens but I am not sure how I can access the CommonTokenStream while processing the AST. Is there anyway I can get the CommonTokenStream used to build the AST?

Upvotes: 1

Views: 720

Answers (1)

Apalala
Apalala

Reputation: 9244

The complete list of tokens is accessible through CommonTokenStream.getTokens(), which you can call before you call the tree walker. The list of tokens would be an argument to the walker. There's no need to change CommonTree, unless you want the recovered information embedded in the tree.

I've used the token list to associate hidden tokens such as comments and explicit line numbers (think FORTRAN) with the closest visible token. This was done post-processing the AST and looking at the line, column, and char-index information which is available for both the tokens in the list and the nodes in the AST.

My attempts at trying to that during AST construction resulted in hacky, unmaintainable code. The post-processing code, OTOH, is Programming-101 algorithmic.

Upvotes: 1

Related Questions