MK.

Reputation: 34587

ANTLR - putting full matched text into the AST

I'm trying to have my AST nodes contain the full text matched by a rule. Trying

statement: 'blah' subrule ';' -> ^(MY_STATEMENT subrule $text);

doesn't work. The CommonTree instances I get back do contain startIndex and stopIndex fields, but these appear to be indexes into the token stream, and I don't see a clear way of reassembling the matched text from them. This seems like such an obvious thing. Can it be done?

Upvotes: 1

Views: 416

Answers (1)

Bart Kiers

Reputation: 170227

Sure, you can put square brackets after an imaginary token to put custom text in an (imaginary) AST node:

tokens {
  TEXT;
}

rule
 : subrule1 subrule2 -> ^(TEXT["custom text here"] subrule1)
 ;

But you can't use $text inside a rewrite rule to refer to all the text that has been matched. Inside rewrite rules, everything that starts with a $ is treated as a label/variable of a rule:

rule
 : text=subrule1 other=subrule2 -> ^($text $other)
 ;

$text can only be used inside an action of a parser (or lexer) rule, where it grabs all the text the rule matched:

rule
 : subrule1 subrule2 {System.out.println("I matched: " + $text);} -> ^(...)
 ;

To get the text inside your rewrite rule, qualify it with the rule name ($statement.text) and pass it as the text of an imaginary token:

grammar T;

options { 
 output=AST; 
}

tokens { 
 MY_STATEMENT;
 TEXT;
}

statement
 : 'blah' subrule ';' -> ^(MY_STATEMENT subrule TEXT[$statement.text])
 ;

subrule
 : Digit Digit
 ;

Digit
 : '0'..'9'
 ;

which will parse the input "blah78;" into an AST with MY_STATEMENT as the root and three children: the Digit tokens 7 and 8, followed by a TEXT node whose text is blah78;.
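
As a side note on the startIndex/stopIndex fields mentioned in the question: since they are indexes into the token stream, the matched text is simply the concatenation of the tokens in that range. The sketch below only models that idea with a plain list of token texts; `TokenTextDemo` and `textBetween` are hypothetical names, not part of ANTLR. In the ANTLR 3 Java runtime, `TokenStream.toString(int start, int stop)` should play the same role when given a tree node's `getTokenStartIndex()` and `getTokenStopIndex()`.

```java
import java.util.Arrays;
import java.util.List;

public class TokenTextDemo {
    // Hypothetical stand-in for a token stream: the text of each token, in order.
    // Joins the tokens from startIndex to stopIndex (both inclusive).
    static String textBetween(List<String> tokenTexts, int startIndex, int stopIndex) {
        // An invalid or empty range yields an empty string instead of throwing.
        if (startIndex < 0 || stopIndex >= tokenTexts.size() || startIndex > stopIndex) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (int i = startIndex; i <= stopIndex; i++) {
            sb.append(tokenTexts.get(i));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Tokens a lexer for grammar T would produce for the input "blah78;".
        List<String> tokens = Arrays.asList("blah", "7", "8", ";");
        System.out.println(textBetween(tokens, 0, 3)); // range of the whole statement
        System.out.println(textBetween(tokens, 1, 2)); // range of subrule only
    }
}
```

This keeps the grammar free of an extra TEXT node, at the cost of having to keep the token stream around for as long as the tree is being inspected.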

Upvotes: 1
