Lightsong
Lightsong

Reputation: 345

Troubles using Bison's recursive rules, and storing values using it (again)

For reference: this is basically a followup question of Troubles using Bison's recursive rules, and storing values using it.

I am trying to make a flex+bison scanner and parser for Newick file format trees in order to do operations on them. The implemented grammar is based on a simplification of this example.

This is esentially a parser for a file format which represents a tree with a series of (recursive) subtrees and/or leaves. The main tree will always end on ; and said tree and all subtrees within will contain a series of nodes between ( and ), with a name and a length to the right of the rightmost parenthesis specified by name and :length, which are optional (you can avoid specifying them, put one of them (name or :length), or both with name:length).

If any node lacks either the name or a length, default values will be applied. (for example: 'missingName' and '1.0')

An example would be (child1:4, child2:6)root:6; , ((child1Of1:2, child2Of1:9)child1:5, child2:6)root:6;

The implementation of said grammar is the following one (NOTE: I translated my own code, as it was in my language, and lots of side stuff (auxiliary functions, etc.) got removed for clarity...):

EDIT: Now I am not (hopefully) evaluating uninitialized variables.

%union {
        struct node {
            char* name; /*the node's assigned name, either from the file or from default values*/
            float length; /*node's length*/
            float lengthInteresting; /*maxium of the distances from this node to a leaf with an interesting string*/
            int isInteresting; /*Does this node contain an interesting string? (1=true,0=false)*/
        } dataOfNode;
}

 

%start tree

%token OP CP COMMA SEMICOLON COLON
 
%token<dataOfNode> DISTANCE NAME

%type<dataOfNode> tree subtrees recursive_subtrees subtree info


%%

tree:   subtrees info SEMICOLON                         {
                                                            $$.name = $2.name;
                                                            $$.length = $1.length + $2.length;
                                                            $$.lengthInteresting = $1.lengthInteresting;
                                                            
                                                            printf("max length is %.2f, max length of those who are interesting is %.2f", $$.length, $$.lengthInteresting);
                                                        }
    
    
subtrees:   OP recursive_subtrees CP                    {
                                                            $$ = $2;                                                        
                                                        }
            ;   
        
            
recursive_subtrees: subtree                             {   
                                                            $$ = $1;                                           
                                                        } // base case for a list of recursive trees     
                    | recursive_subtrees COMMA subtree  {
                                                            $$ = $1;
                                                            
                                                            if ($$.length < $3.length) {
                                                                $$.length = $3.length;
                                                            }  
                                                            
                                                            if ($$.lengthInteresting < $3.lengthInteresting) {
                                                                $$.lengthInteresting = $3.lengthInteresting;
                                                            }
                                                        } // subtree, subtree, subtree...      
    

subtree:    subtrees info                   {   $$ = $2;
                                                $$.length = $$.length + $1.length;
                                                $$.lengthInteresting = $$.lengthInteresting + $1.lengthInteresting; // if it has any interesting node, $1 will have some length value
                                            }                         
            
            | info                          {   
                                                $$ = $1;  
                                            }       // a leaf
                    
                    
                    
info:   NAME COLON DISTANCE {   $$.name= $$.name; $$.length = $3.length;
                                
                                if($$.name contains interesting string) {       // it cointains an interesting string
                                    $$.isInteresting = 1;
                                    $$.lengthInteresting = $3.length;
                                } else {                                        // ...it doesn't
                                    $$.isInteresting = 0;
                                    $$.lengthInteresting = 0.0;
                                }
                            }       // with name and distance
        
        | NAME              {   $$.name= $1.name; $$.length = 1.0;
                                
                                if($$.name contains interesting string) {       // it cointains an interesting string
                                    $$.isInteresting = 1;
                                    $$.lengthInteresting = 1.0;
                                } else {                                        // ...it doesn't
                                    $$.isInteresting = 0;
                                    $$.lengthInteresting = 0.0;
                                }
                            }             // without distance
        
        | COLON DISTANCE    {   $$.name= "missingName"; $$.length = $2.length;} // without name
        
        |                   {   $$.name= "missingName"; $$.length = 1.0;}       // without name nor distance
        
        ;


%%

With the assistance I had on my previous question, I have implemented a way of getting the maximum distance/length of the tree's root to a leaf without reconstructing the entire tree's structure. This is done by replacing lenghts recursively when bison is going backwards.

I am trying to do now something similar: get the maximum distance from the root to a leaf which has a specific name I may find interesting (knowing that there may be multiple leaves with said name, I need to do something similar as I did with the maximum length)

However, no matter what I do, I don't get the maximum distance from the root to any of the interesting nodes correctly, whereas the maximum length to any node is displayed correctly. I suspect that my reasoning for getting/replacing/checking the values recursively is still wrong.

Upvotes: 0

Views: 143

Answers (1)

Chris Dodd
Chris Dodd

Reputation: 126408

You have rule+actions that look like:

recursive_subtrees: subtree {   
                              if ($$.length < $1.length) { 

where you are reading the value of $$ (specifically the length field) before ever storing anything in it. So you're essentially getting uninitialized garbage and comparing that to the length of the subtree, which is not going to do anything useful. There are quite a few actions where you do this (using $$ before ever assigning anything to it).

This may indicate a fundamental misunderstanding of how actions work in a bison parser -- actions run when each rule is "reduced" and rules are reduced "bottom up"1, starting from the leaves of the parse tree and ending at the root.

What are you trying to do here?


1Its always confusing that trees are upside down in computer science, with the root at the top and leaves at the bottom

Upvotes: 2

Related Questions