DYZ
DYZ

Reputation: 57105

S/R conflict in Bison

I have a problem with a [relatively simple] grammar, a distilled version of a much bigger grammar. Bison reports one shift/reduce conflict, which is resolved incorrectly: the resulting parser fails to parse valid language. I am not kidding: I intermittently spent more than a year trying to eliminate the conflict but failed. ChatGPT is uncooperative :) Please advise.

%token YY_WORD_NAME
%left YY_IN 

%%

word_exp:
  word_exp YY_IN set_exp 
| YY_WORD_NAME 

set_exp:
  '{' word_exp '}' 
| '{' YY_WORD_NAME YY_IN set_exp ':' word_exp '}'

%%

Upvotes: 1

Views: 30

Answers (1)

Chris Dodd
Chris Dodd

Reputation: 126448

The basic problem you have here is lookahead -- in order to decide that a YY_WORD_NAME is not a word_exp, you need to look ahead through the following YY_IN set_expr (and unbounded number of tokens) to see the :.

The easiest fix for this is probably to "relax" the grammar a bit (accept some constructs that are not legal), and then have a post-pass that flags the erroneous conditions. If you use:

word_exp:
  word_exp YY_IN set_exp 
| YY_WORD_NAME 

set_exp:
  '{' word_exp '}' 
| '{' word_exp ':' word_exp '}'

That fixes the conflict, but accepts things that don't have exactly one YY_IN before the : like:

{ A in { B } in { C } : D }
{ A : B }

(here I'm assuming YY_IN is an in keyword and YY_WORD_NAME is just a non-keyword identifier)

You can even flag this as an error in the bison action immediately after the incorrect construct (rather than by building a data structure you later traverse, though you may be doing that anyway for other reasons)

%token YY_WORD_NAME
%left YY_IN 

%union {
    struct {
        int     incnt;
        // any other info needed from word_exp
    } word_exp;
}

%type<word_exp> word_exp

%%

word_exp:
  word_exp YY_IN set_exp { $$.incnt = $1.incnt + 1; }
| YY_WORD_NAME           { $$.incnt = 0; }

set_exp:
  '{' word_exp '}' 
| '{' word_exp ':' word_exp '}'  { if ($2.incnt != 1) { 
                                       yyerror("Syntax error");
                                       YYERROR; /* or YYABORT */
                                   }
                                 }

%%

Upvotes: 2

Related Questions