Resolving messy precedence in Bison Parser

Question

I'm currently building a parser for my own language but I'm having trouble with precedence (I think that's the problem). I only discovered this issue while I was constructing the AST. I defined the following tokens:

%token T_SEMICOLON T_COMMA T_PERIOD T_RETURN T_BOOLEAN T_EXTENDS T_TRUE T_FALSE T_IF T_DO T_NEW T_ELSE T_EQUALSSIGN T_PRINT T_INTEGER T_NONE T_WHILE T_NUMBER T_VARIABLE T_LEFTCURLY T_RIGHTCURLY T_LEFTPAREN T_RIGHTPAREN T_ARROW T_EOF

%left T_OR
%left T_AND
%left T_MINUS T_EQUALSWORD T_PLUS 
%left T_GE T_GEQ
%left T_MULTIPLY T_DIVIDE
%right T_NOT

(only the ones with precedence really matter in this case) and I have a grammar that takes the form

Something: T_PRINT expression T_SEMICOLON

where I have several expression productions, some of them being

expression        : expression T_PLUS expression         {$$ = new PlusNode($1, $3);}                             
                  | expression T_MINUS expression        {$$ = new MinusNode($1, $3);} 
                  | expression T_EQUALSWORD expression   {$$ = new EqualNode($1, $3);}          
                  | expression T_MULTIPLY expression     {$$ = new TimesNode($1, $3);}           
                  | expression T_DIVIDE expression       {$$ = new DivideNode($1, $3);} 
                  | T_NOT expression                     {$$ = new NotNode($2);}                               
                  | T_MINUS expression                   {$$ = new NegationNode($2);}                
                  | T_VARIABLE                           {$$ = new VariableNode($1);} 
                  | T_NUMBER               {$$ = new IntegerLiteralNode($1);}                         
                  | T_TRUE                 {$$ = new BooleanLiteralNode(new IntegerNode(1));}                                      
                  | T_FALSE                {$$ = new BooleanLiteralNode(new IntegerNode(0));}                         
                  | T_NEW T_VARIABLE       {$$ = new NewNode($2, NULL);}

when I try to parse something like print true equals new c0 - new c0; I get a strange output AST that looks like Print(Minus(Equal(BooleanLiteral(1), New("c0")), New("c0"))). Here it looks like the T_EQUALSWORD token has higher precedence than the T_MINUS token even though I defined it otherwise?

This problem also occurs if I change the minus to a plus; inputting print true equals new c0 + new c0; I get Print(Equal(BooleanLiteral(1), GreaterEqual(New("c0"), New("c0")))) as output. The form seems correct but I for some reason am getting an T_GEQ token instead of an T_PLUS?

Interestingly, this parses correctly for T_MULTIPLY and T_DIVIDE instead:

Input: print true equals new c0 * new c0;

Output: Print(Equal(BooleanLiteral(1), Times(New("c0"), New("c0"))))

Input: print true equals new c0 / new c0;

Output: Print(Equal(BooleanLiteral(1), Divide(New("c0"), New("c0"))))

So this seems to work correctly with multiply and divide but is failing with plus and minus. Any ideas?

EDIT: I added %prec T_OR to the T_EQUALSWORD production and this solved the problem when I use subtraction, but I still get the weird T_GEQ token when I use addition.

Resolving messy precedence in Bison Parser

Answers (1)

Related Questions