Salvatore
Salvatore

Reputation: 1185

Unable to fix shift/reduce conflicts in Bison grammar

I have the following Bison grammar:

%error-verbose
%{
#include "node.h"
NBlock *programBlock;

#define YYDEBUG 1

extern int yylex();
void yyerror(const char *s) { printf("Error: %s\n", s); }
%}

%union {
    Node *node;
    NBlock *block;
    NBody *body;
    NHeader *header;
    NExpression *expression;
    NStatement *statement;
    NIdentifier *identifier;
    NVariableDeclaration *variableDeclaration;
    NDoWhileStatement *doWhileStatement;
    NWhileStatement *whileStatement;
    NIfStatement *ifStatement;
    NForStatement *forStatement;
    std::vector<NVariableDeclaration*> *variableDeclarations;
    std::vector<NExpression*> *expressions;
    std::vector<NStatement*> *statements;
    std:string *string;
    int token;
}

/*
The %token directive is used to associate a type to a terminal symbol.
%token <type> 'terminal_list'
associates the specific type <type> to each terminal in 'terminal_list'.
The type <type> is the same used in the %union declaration
*/

%token <string> TIDENTIFIER TINTEGER TDOUBLE
%token <token> TCEQ TCNE TCLT TCLE TCGT TCGE TEQUAL
%token <token> TLPAREN TRPAREN TLBRACE TRBRACE TCOMMA TDOT
%token <token> TPLUS TMINUS TMUL TDIV TDO TDOUBLE_TYPE TINT_TYPE
%token <token> TELSE TFOR TIF TSEMICOLON TTHEN TWHILE

/*
The %type directive is used to associate a type to a nonterminal symbol.
%type <type> nonterminal_list
associates the specific type <type> to each nonterminal in 'nonterminal_list'.
The type <type> is the same used in the %union declaration
*/

%type <expression> expression term factor
%type <block> program body header tail statements
%type <statement> statement forStatement ifStatement doWhileStatement whileStatement variableDeclaration
%type <token> comparison
%type <string> identifier_type

/*
Operator precedence for mathematical operators
*/

%left TPLUS TMINUS
%left TMUL TDIV
%left TCEQ TCNE TCLT TCLE TCGT TCGE

/*
Start grammar symbol
*/

%start program

%%

program:                TLBRACE body TRBRACE { printf("Reduce body to program\n"); }
                        ;

body:                   header TLBRACE block TRBRACE tail { printf("Reduce header block tail to body\n"); }
                        ;

header:                 TLBRACE variableDeclarations TRBRACE { printf("Reduce variableDeclarations to header\n"); }
                        | TLBRACE TRBRACE { printf("Reduce empty to header\n"); }
                        ;

variableDeclarations:   variableDeclaration TSEMICOLON { printf("Reduce variable declaration to header\n"); }
                        | variableDeclarations variableDeclaration TSEMICOLON { printf("Reduce variable declaration list to header\n"); }
                        ;

tail:                   TLBRACE statements TRBRACE { printf("reduce statement list to tail\n"); }
                        | TLBRACE TRBRACE { printf("Reduce empty to tal\n"); }
                        ;                   

statements:             statement TSEMICOLON { printf("Reduce statement to statement list\n"); }
                        | statements statement TSEMICOLON { printf("Reduce statement list to statement list\n"); }
                        ;

statement:              ifStatement { printf("Reduce if to statement\n"); };
                        | forStatement { printf("Reduce for to statement\n"); };
                        | doWhileStatement { printf("Reduce doWhile to statement\n"); };
                        | whileStatement { printf("reduce while to statement\n"); }
                        | TIDENTIFIER TEQUAL expression { printf("Reduce assignment to expression\n"); }
                        ;

forStatement:           TFOR TLPAREN expression TSEMICOLON expression TSEMICOLON expression TRPAREN block { printf("Reduce for to for statement\n"); }
                        ;


ifStatement:            TIF expression TTHEN block { printf("Reduce if to if statement\n"); }
                        | TIF expression block TELSE block { printf("Reduce ifelse to if statement\n"); }
                        ;

doWhileStatement:       TDO block TWHILE expression { printf("reduce dowhile to while statement\n"); }
                        ;

whileStatement:         TWHILE block expression { printf("Reduce while to while statement\n"); }
                        ;

block:                  TLBRACE statements TRBRACE { printf("Reduce statement list to block\n"); }
                        | TLBRACE TRBRACE { printf("Reduce empty to block\n"); }
                        ;

variableDeclaration:    identifier_type TIDENTIFIER { printf("reduce uninitialized identifier to variable declaration\n"); }
                        | identifier_type TIDENTIFIER TEQUAL expression { printf("Reduce initialized identifier to variable declaration\n"); }
                        ;

identifier_type:        TINT_TYPE { printf("Reduce int to identifier type\n"); }
                        | TDOUBLE_TYPE { printf("Reduce double to identifier type\n"); }
                        | { printf("Reduce empty to identifier type\n"); }
                        ;



expression:             term { printf("Reduce term to expresson\n"); }
                        | expression comparison expression { printf("Reduce comparison to expression\n"); }
                        | expression TPLUS term { printf("Reduce addition to expression\n"); }
                        | expression  TMINUS term { printf("Reduce subtraction to expression\n"); }
                        ;

term:                   factor { printf("Reduce factor to term\n"); }
                        | term TMUL factor { printf("Reduce multiplication to term\n"); }
                        | term TDIV factor { printf("Reduce division to term\n"); }
                        ;

factor:                 TLPAREN expression TRPAREN { printf("Reduce nested expression to expression\n"); }
                        | TMINUS factor { printf("Reduce -factor to factor\n"); }
                        | TIDENTIFIER { printf("Reduce identifier to factor\n"); }
                        | TINTEGER { printf("Reduce integer to numeric\n"); }
                        | TDOUBLE { printf("Reduce double to numeric\n"); }
                        ;

comparison:             TCEQ { printf("Reduce eq to comparison\n"); }
                        | TCNE { printf("Reduce ne to comparison\n"); }
                        | TCLT { printf("Reduce lt to comparison\n"); }
                        | TCLE { printf("Reduce le to comparison\n"); }
                        | TCGT { printf("reduce gt to comparison\n"); }
                        | TCGE { printf("Reduce ge to comparison\n"); }
                        ; 

and I get 8 shift/reduce conflicts which I don't know how to fix.

The following is a piace of the parser.output file which I generated using the --report=all parameter. It is the state which is affected by the 8 shift/reduce conflicts:

State 79

   29 expression: expression . comparison expression
   29           | expression comparison expression .  [TCEQ, TCNE, TCLT, TCLE, TCGT, TCGE, TRPAREN, TLBRACE, TPLUS, TMINUS, TSEMICOLON, TTHEN]
   30           | expression . TPLUS term
   31           | expression . TMINUS term
   40 comparison: . TCEQ
   41           | . TCNE
   42           | . TCLT
   43           | . TCLE
   44           | . TCGT
   45           | . TCGE

    TCEQ    shift and go to state 56
    TCNE    shift and go to state 57
    TCLT    shift and go to state 58
    TCLE    shift and go to state 59
    TCGT    shift and go to state 60
    TCGE    shift and go to state 61
    TPLUS   shift and go to state 62
    TMINUS  shift and go to state 63

    TCEQ      [reduction with rule 29 (expression)]
    TCNE      [reduction with rule 29 (expression)]
    TCLT      [reduction with rule 29 (expression)]
    TCLE      [reduction with rule 29 (expression)]
    TCGT      [reduction with rule 29 (expression)]
    TCGE      [reduction with rule 29 (expression)]
    TPLUS     [reduction with rule 29 (expression)]
    TMINUS    [reduction with rule 29 (expression)]
    $default  reduction with rule 29 (expression)

    comparison  go to state 64

If I understand well the problem is that the parser does not know if going on reading another pice of text or reducing immediately the rule expression: expression comparison expression.

I would say that reducing immediately is correct. But if I this is correct, then how do I enforce immediate reduction insted of shifting?

Upvotes: 1

Views: 1801

Answers (3)

Salvatore
Salvatore

Reputation: 1185

Thanks for your answers. Here is the complete Bison grammar:

%error-verbose
%{
#include "node.h"
#include <iostream>
#include <string>
NBlock *programBlock;

#define YYDEBUG 1

extern int yylex();
void yyerror(const char *) { printf("Error: %s\n", s); }
%}

%union {
    Node *node;
    NBlock *block;
    NBody *body;
    NHeader *header;
    NExpression *expression;
    NStatement *statement;
    NIdentifier *identifier;
    NVariableDeclaration *variableDeclaration;
    NDoWhileStatement *doWhileStatement;
    NWhileStatement *whileStatement;
    NIfStatement *ifStatement;
    NForStatement *forStatement;
    std::vector<NVariableDeclaration*> *variableDeclarations;
    std::vector<NExpression*> *expressions;
    std::vector<NStatement*> *statements;
    std:string *string;
    int token;
}

%token <string> TIDENTIFIER TINTEGER TDOUBLE
%token <token> TCEQ TCNE TCLT TCLE TCGT TCGE TEQUAL
%token <token> TLPAREN TRPAREN TLBRACE TRBRACE TCOMMA TDOT
%token <token> TPLUS TMINUS TMUL TDIV TDO TDOUBLE_TYPE TINT_TYPE
%token <token> TELSE TFOR TIF TSEMICOLON TTHEN TWHILE TLSQUARE TRSQUARE

%type <expression> expression term factor
%type <block> program body header tail
%type <statements> statements
%type <statement> statement
%type <forStatement> forStatement
%type <ifStatement> ifStatement
%type <doWhileStatement> doWhileStatement
%type <whileStatement> whileStatement
%type <variableDeclaration> variableDeclaration
%type <token> comparison
%type <string> identifier_type
%type <variableDeclarations> variableDeclarations

/*
Start grammar symbol
*/

%start program

%%

program:                TLBRACE body TRBRACE { printf("Reduce body to program\n"); }
                        ;

body:                   header TLBRACE block TRBRACE tail { printf("Reduce header block tail to body\n"); }
                        ;

header:                 TLBRACE variableDeclarations TRBRACE { printf("Reduce variableDeclarations to header\n"); }
                        | TLBRACE TRBRACE { printf("Reduce empty to header\n"); }
                        ;

variableDeclarations:   variableDeclaration TSEMICOLON { printf("Reduce variable declaration to header\n"); }
                        | variableDeclarations variableDeclaration TSEMICOLON { printf("Reduce variable declaration list to header\n"); }
                        ;

tail:                   TLBRACE statements TRBRACE { printf("reduce statement list to tail\n"); }
                        | TLBRACE TRBRACE { printf("Reduce empty to tal\n"); }
                        ;                   

statements:             statement TSEMICOLON { printf("Reduce statement to statement list\n"); }
                        | statements statement TSEMICOLON { printf("Reduce statement list to statement list\n"); }
                        ;

statement:              ifStatement { printf("Reduce if to statement\n"); };
                        | forStatement { printf("Reduce for to statement\n"); };
                        | doWhileStatement { printf("Reduce doWhile to statement\n"); };
                        | whileStatement { printf("reduce while to statement\n"); }
                        | TIDENTIFIER TEQUAL expression { printf("Reduce assignment to expression\n"); }
                        | TIDENTIFIER TLSQUARE TINTEGER TRSQUARE TEQUAL expression { printf("Reduce array assignment to expression\n"); }
                        ;

forStatement:           TFOR TLPAREN expression TSEMICOLON expression TSEMICOLON expression TRPAREN block { printf("Reduce for to for statement\n"); }
                        ;


ifStatement:            TIF expression TTHEN block { printf("Reduce if to if statement\n"); }
                        | TIF expression TTHEN block TELSE block { printf("Reduce ifelse to if statement\n"); }
                        ;

doWhileStatement:       TDO block TWHILE expression { printf("reduce dowhile to while statement\n"); }
                        ;

whileStatement:         TWHILE block expression { printf("Reduce while to while statement\n"); }
                        ;

block:                  TLBRACE statements TRBRACE { printf("Reduce statement list to block\n"); }
                        | TLBRACE TRBRACE { printf("Reduce empty to block\n"); }
                        ;

variableDeclaration:    identifier_type TIDENTIFIER { printf("Reduce uninitialized identifier to variable declaration\n"); }
                        | identifier_type TIDENTIFIER TLSQUARE TINTEGER TRSQUARE { printf("Reduce  array to variable declaration\n"); }
                        | identifier_type TIDENTIFIER TEQUAL expression { printf("Reduce initialized identifier to variable declaration\n"); }
                        | identifier_type TIDENTIFIER TLSQUARE TINTEGER TRSQUARE TEQUAL expression { printf("Reduce initialized array to variable declaration\n"); }
                        ;

identifier_type:        TINT_TYPE { printf("Reduce int to identifier type\n"); }
                        | TDOUBLE_TYPE { printf("Reduce double to identifier type\n"); }
                        | { printf("Reduce empty to identifier type\n"); }
                        ;


expression:             add-expression { printf("Reduce add-expression to expression\n"); }
                        |add-expression comparison add-expression { printf("Reduce add-expression comparison add-expression to expression\n"); }


add-expression:         term { printf("Reduce term to expresson\n"); }
                        | add-expression TPLUS term { printf("Reduce addition to expression\n"); }
                        | add-expression  TMINUS term { printf("Reduce subtraction to expression\n"); }
                        ;

term:                   factor { printf("Reduce factor to term\n"); }
                        | term TMUL factor { printf("Reduce multiplication to term\n"); }
                        | term TDIV factor { printf("Reduce division to term\n"); }
                        ;

factor:                 TLPAREN expression TRPAREN { printf("Reduce nested expression to expression\n"); }
                        | TMINUS factor { printf("Reduce -factor to factor\n"); }
                        | TIDENTIFIER { printf("Reduce identifier to factor\n"); }
                        | TINTEGER { printf("Reduce integer to numeric\n"); }
                        | TDOUBLE { printf("Reduce double to numeric\n"); }
                        ;

comparison:             TCEQ { printf("Reduce eq to comparison\n"); }
                        | TCNE { printf("Reduce ne to comparison\n"); }
                        | TCLT { printf("Reduce lt to comparison\n"); }
                        | TCLE { printf("Reduce le to comparison\n"); }
                        | TCGT { printf("reduce gt to comparison\n"); }
                        | TCGE { printf("Reduce ge to comparison\n"); }
                        ; 

Upvotes: 0

Chris Dodd
Chris Dodd

Reputation: 126203

Your grammar is ambiguous -- an input like 1 < 2 < 3 can be parsed as either (1 < 2) < 3 or 1 < (2 < 3).

There are two ways to deal with this -- either add %left/%right/%nonassoc directives to use bison's internal precedence handling, or introduce additional levels of rules to handle it.

Now for your other operators (* / + -) you're doing BOTH of these -- this is generally a mistake and you want to do just one or the other. But if you do do both, the additional rules will take precedence and the precedence directives will be ignored and sometimes cause suprising problems.

The "normal" handling for relations like this is to say you can't have multiple of them (1 < 2 < 3 is a syntax error and should not be parsed either left or right recursive.) To do this with additional rules, you'd change you expression rule to:

expression:           add_expression
                    | add_expression comparison add_expression
                    ;
add_expression:       term
                    | add_expression TPLUS term
                    | add_expression TMINUS term
                    ;

to use precedence directives, get rid of term, factor, and comparison (moving them all into expression and add:

%nonassoc TCEQ TCNE TCLT TCLE TCGT TCGE
%left TPLUS TMINUS
%left TMUL TDIV

Upvotes: 5

Gene
Gene

Reputation: 46960

The problem is that associativity rules (like your %left directive) don't work through grammar indirection. If you do away with the comparison rule and pull all the comparison operators up into the expression rule, the problem goes away. (I verified this with Bison v2.4.1.)

This may be inconvienient because it makes expression more verbose than you'd like. A workaround is to push the decision back to the scanner. Define COMPARISON as a single token, and pass the flavor of comparison as a separate enumerated type through yylval so it's available as $n.

Upvotes: 1

Related Questions