Maytweeer
Maytweeer

Reputation: 1

Yacc conflict i cant fix

i've been trying to fix a shift/reduce conflict in my yacc specification and i can't seem to find where it is.

%union{
   char* valueBase;
   char* correspondencia;
}
%token pal palT palC
%type <valueBase> pal  
%type <correspondencia> palT palC Smth

%%

Dicionario  : Traducao
            | Dicionario Traducao
            ;

Traducao    : Palavra Correspondencia
            ;
Palavra : Base Delim
        | Exp
        ;

Delim : 
      | ':'
      ;

Correspondencia :
                | palC                  {printf("PT Tradução: %s\n",$1);}
                ;

Exp : Smth '-' Smth                  {aux = yylval.valueBase; printf("PT Tradução: %s %s %s\n", $1, aux, $3);}
    ;

Smth : palT                         {$$ = strdup($1);}
     |                              {$$ = "";}
     ;

Base    : pal                   {printf("EN Palavra base: %s\n",$1);}
        ;

Any help to find and fix this conflict would be extremely appreciated.

Upvotes: 0

Views: 37

Answers (1)

Chris Dodd
Chris Dodd

Reputation: 126408

So looking at the y.output file from your grammar, you have a shift/reduce conflict in state 13:

State 13

   10 Exp: Smth '-' . Smth

    palT  shift, and go to state 2

    palT      [reduce using rule 12 (Smth)]
    $default  reduce using rule 12 (Smth)

    Smth  go to state 16

Basically, what this is saying is that when parsing an Exp after having seen a Smth '-' and looking at a lookahead of palT, it doesn't know whether it should reduce an empty Smth to finish the Exp (leaving the palT as part of some later construct) OR shift the palT so it can then be reduced (recognized) as a Smth that completes this Exp.

The language you are recognizing is a sequence of one or more Traducao, each of which consists of a Palavra followed by an optional palC (Correspondencia that may be a palC or empty). That means that you might have a Palavra directly following another Palavra (the Correspondencia for the first one is empty). So the parser needs to find the boundary between one Palavra and the next just by looking at its current state and one token of lookahead, which is a problem.

In particular, when you have an input like PalT '-' PalT '-' PalT, that is two consecutive Palavra, but it is not clear whether the middle PalT belongs to the first one or the second. It is ambiguous, because it could be parsed successfully either way.

If you want the parser to just accept as much as possible into the first Palavra, then you can just accept the default resolution (of shift). If that is wrong and you would want the other interpretation, then you are going to need more lookahead to recognize this case, as it depends on whether or not there is a second '-' after the second palT or something else.

Upvotes: 2

Related Questions