webchatowner

Reputation: 155

How Do I Build an Abstract Syntax Tree with JJTree?

When building an AST and adding children to the tree, what is the difference between:

void NonTerminal() #Nonterminal : { Token t; }
{
    t = <MULTIPLY> OtherNonTerminal() {jjtThis.value = t.image;} #Multiply
}

and:

void NonTerminal() : { Token t; }
{
    t = <MULTIPLY> OtherNonTerminal() {jjtThis.value = t.image;} #Multiply(2)
}

Note:

<MULTIPLY : "*">

Are there any major differences and will both of these work the same way?

Also would another way of building the tree for this production rule:

void NonTerminal() : { Token t; }
{
    t = <MULTIPLY> OtherNonTerminal() { jjtThis.value = t.image; } #Mult(2)
|   t = <DIVIDE> OtherNonTerminal() { jjtThis.value = t.image; } #Div(2)
|   {}
}

be like this:

void NonTerminal() #Nonterminal(2) : { Token t; }
{
    (t = <MULTIPLY> OtherNonTerminal() | t = <DIVIDE> OtherNonTerminal() | {}) {jjtThis.value = t.image;}
}

Upvotes: 0

Views: 1208

Answers (2)

Theodore Norvell

Reputation: 16221

In the first case

void NonTerminal() #Nonterminal : { Token t; }
{
    t = <MULTIPLY>
    OtherNonTerminal() {jjtThis.value = t.image;}
    #Multiply
}

the Multiply node will have as children all the nodes pushed on the stack during its node scope, excluding any that are popped before the end of the scope. In this case that means all the nodes pushed and not popped during the parsing of OtherNonTerminal.

In the second example

void NonTerminal() #void : { Token t; }
{
    t = <MULTIPLY>
    OtherNonTerminal() {jjtThis.value = t.image;} 
    #Multiply(2)
}

the Multiply node will get the two top nodes from the stack as its children.

So the two will probably build different trees.

The other difference is that the second example doesn't specify a node associated with Nonterminal.

In the first case, this tree will be pushed

        Nonterminal
             |
          Multiply
              |
All nodes pushed (but not popped) during the parsing of OtherNonTerminal

In the second case, the parsing of OtherNonTerminal will do its thing (popping and pushing nodes), then two nodes will be popped and this tree will be pushed

     Multiply
      |     |
  A child  Another child
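
To make the two stack behaviors concrete, here is a small Java model of them. It is only a sketch and not the code JJTree generates: the class and the names `open`, `closeIndefinite`, and `closeDefinite` are made up for illustration, and it assumes one node (`Left`) is already on the stack and that OtherNonTerminal pushes exactly one node (`Right`).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical model of a JJTree-style node stack (illustration only).
class NodeScopeSketch {
    /** Minimal tree node: a name plus ordered children. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();
        Node(String name) { this.name = name; }
        @Override public String toString() {
            return children.isEmpty() ? name : name + children;
        }
    }

    private final List<Node> stack = new ArrayList<>();      // top is the last element
    private final Deque<Integer> marks = new ArrayDeque<>(); // stack size at each scope open

    void push(Node n) { stack.add(n); }
    void open() { marks.push(stack.size()); }

    /** Models #Name: adopt every node pushed (and not popped) since open(). */
    void closeIndefinite(Node n) {
        int mark = marks.pop();
        adopt(n, stack.size() - mark);
    }

    /** Models #Name(k): adopt the top k nodes, no matter who pushed them. */
    void closeDefinite(Node n, int k) {
        marks.pop();
        adopt(n, k);
    }

    private void adopt(Node n, int count) {
        List<Node> top = stack.subList(stack.size() - count, stack.size());
        n.children.addAll(top);
        top.clear();   // removes the adopted nodes from the stack
        stack.add(n);
    }

    /** First case: #Multiply nested inside #Nonterminal. */
    static String firstCase() {
        NodeScopeSketch s = new NodeScopeSketch();
        s.push(new Node("Left"));   // assumed to be on the stack already
        s.open();                   // scope of #Nonterminal
        s.open();                   // scope of #Multiply
        s.push(new Node("Right"));  // assumed: OtherNonTerminal() pushes one node
        s.closeIndefinite(new Node("Multiply"));
        s.closeIndefinite(new Node("Nonterminal"));
        return s.stack.toString();
    }

    /** Second case: #void production with #Multiply(2). */
    static String secondCase() {
        NodeScopeSketch s = new NodeScopeSketch();
        s.push(new Node("Left"));
        s.open();                   // scope of #Multiply(2)
        s.push(new Node("Right"));
        s.closeDefinite(new Node("Multiply"), 2);
        return s.stack.toString();
    }

    public static void main(String[] args) {
        System.out.println(firstCase());   // [Left, Nonterminal[Multiply[Right]]]
        System.out.println(secondCase());  // [Multiply[Left, Right]]
    }
}
```

Under those assumptions the first case leaves `Left` untouched on the stack, while the second case pulls it in as a child of Multiply.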

For the second question, the difference between

void NonTerminal() #void : { Token t; }
{
    t = <MULTIPLY>
    OtherNonTerminal()
    { jjtThis.value = t.image; }
    #Mult(2)
|
    t = <DIVIDE>
    OtherNonTerminal()
    { jjtThis.value = t.image; }
    #Div(2)
|
    {}
}

and

void NonTerminal() #Nonterminal(2) : {
    Token t; }
{
    ( t = <MULTIPLY> OtherNonTerminal()
    | t = <DIVIDE> OtherNonTerminal()
    | {}
    )
    {jjtThis.value = t.image;}
}

is that the first does not build a node when the empty sequence is matched.

Consider the second way in the case where the next token is something other than * or /. You will get

         Nonterminal
         /         \
  Some node you    Some other node
  don't want       you don't want
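
The empty-alternative problem can be reproduced with a toy stack of strings. This is a rough sketch, not JJTree output: the class, the method names, and the placeholder nodes `X`, `Y`, and `Operand` are all hypothetical, and it assumes two unrelated nodes are already on the stack when NonTerminal is entered.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model: nodes are plain strings, the stack top is the list end.
class EmptyAlternativeSketch {
    /** Models #Name(k): replace the top k entries with "Name[children]". */
    static void closeDefinite(List<String> stack, String name, int k) {
        List<String> top = stack.subList(stack.size() - k, stack.size());
        String node = name + top;   // e.g. "Mult[Y, Operand]"
        top.clear();
        stack.add(node);
    }

    /** Two unrelated nodes assumed to be on the stack before NonTerminal. */
    static List<String> initialStack() {
        return new ArrayList<>(Arrays.asList("X", "Y"));
    }

    /** First grammar: a node is built only inside the * and / alternatives. */
    static String perAlternative(String op) {
        List<String> stack = initialStack();
        if (op.equals("*")) {
            stack.add("Operand");
            closeDefinite(stack, "Mult", 2);
        } else if (op.equals("/")) {
            stack.add("Operand");
            closeDefinite(stack, "Div", 2);
        }
        // Empty alternative: nothing is pushed or popped.
        return stack.toString();
    }

    /** Second grammar: #Nonterminal(2) closes on every path, even the empty one. */
    static String wholeProduction(String op) {
        List<String> stack = initialStack();
        if (op.equals("*") || op.equals("/")) {
            stack.add("Operand");
        }
        closeDefinite(stack, "Nonterminal", 2);
        return stack.toString();
    }

    public static void main(String[] args) {
        System.out.println(perAlternative(""));   // [X, Y]
        System.out.println(wholeProduction(""));  // [Nonterminal[X, Y]]
    }
}
```

With no * or / in the input, the first version leaves the stack alone, while the second wraps `X` and `Y` in a Nonterminal node they have nothing to do with.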

I'm actually surprised that the second one even gets past the Java compiler, since t is potentially uninitialized where it is read: the empty alternative never assigns it.

Upvotes: 1

sarath kumar

Reputation: 380

Yes, there is a difference.

A JavaCC or JJTree grammar is processed in separate steps.

  1. Lexical analysis, where individual characters are collected and matched against the regular expressions given in the TOKEN, SPECIAL_TOKEN, MORE and SKIP sections. Each successful match produces a token.
  2. Syntax analysis, where these tokens are arranged, according to the production rules, into a syntax tree with terminal and non-terminal nodes. Syntax analysis consumes every token produced by lexical analysis and tries to validate the syntax from them.

    NON-TERMINAL node: refers to another production rule.

    TERMINAL node: represents a token or data node.

And here is the difference:

  1. After the syntax has been verified, we need it in a form we can work with. The most useful representation is a tree. Syntax analysis already gives us a syntax tree, which can be reshaped into a more useful one, and this is where JJTree comes into the picture. To rename nodes and build a useful tree structure, use the #NODE_NAME syntax in production rules.

Edit in response to the comment below:

#Multiply(2) means the node gets exactly two children. This makes sense if your operation is A*B. If you parse A*B*C with #Multiply(2), the tree will look like

          Multiply
        /          \
  Multiply           C
    /  \
  A     B
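
The left-nested shape above can be traced with a few lines of Java. This is a sketch under an assumption the answer does not spell out: that the grammar has the hypothetical form Operand() ( "*" Operand() #Multiply(2) )*, so a Multiply node is closed over the top two stack entries after each operand that follows a "*". The class and method names are invented for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical trace of #Multiply(2) over a chain of operands.
class LeftNestedSketch {
    static String build(String... operands) {
        List<String> stack = new ArrayList<>();
        stack.add(operands[0]);                     // first operand
        for (int i = 1; i < operands.length; i++) {
            stack.add(operands[i]);                 // operand after each "*"
            int n = stack.size();
            String right = stack.remove(n - 1);     // top of the stack
            String left = stack.remove(n - 2);      // node just below it
            stack.add("Multiply[" + left + ", " + right + "]");
        }
        return stack.get(0);
    }

    public static void main(String[] args) {
        System.out.println(build("A", "B", "C"));   // Multiply[Multiply[A, B], C]
    }
}
```

Each pass folds the previous result and the next operand into a new Multiply node, which is why the earlier product ends up as the left child.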

If you parse A*B*C with #Multiply, the tree will look like

   Multiply    Multiply      Multiply
      |            |             | 
      A            B             C

Basically, the difference between #Multiply and #Multiply(2) is that #Multiply(2) needs two nodes before the Multiply node can be built, and throws an exception if only one is found, whereas #Multiply builds a node whenever the production rule is matched.

Upvotes: 1
