wvxvw
wvxvw

Reputation: 9465

Explain this shift-reduce conflict

Below is a very simplified XML grammar for Bison:

head    : NODE_START NAME atts
        | NODE_START NAME
        ;
element : head NODE_CLOSE NODE_END
        | head NODE_END anys NODE_START NODE_CLOSE NAME NODE_END
        | head NODE_END NODE_START NODE_CLOSE NAME NODE_END
        ;
text    : TEXT
        ;
comment : NODE_START COMMENT_START COMMENT_END NODE_END
        ;
cdata   : NODE_START CDATA_START CDATA_END NODE_END
        ;
attr    : NAME EQUALS value
        ;
value   : QUOTED
        | APOSED
        ;
atts    : attr atts
        | attr
elt     : element
        | comment
        | cdata
any     : elt
        | text
        ;
elts    : elt elts
        | elt
        ;
anys    : text elts anys
        | elts
        | text
        ;
s       : any
        | PROLOG any
        ;

The alleged conflict is the rule anys -> text.

When I look at the corresponding output:

State 35

   21 anys: text elts . anys

    NODE_START  shift, and go to state 1
    TEXT        shift, and go to state 2

    head     go to state 4
    element  go to state 5
    text     go to state 25
    comment  go to state 7
    cdata    go to state 8
    elt      go to state 26
    elts     go to state 27
    anys     go to state 42

How do I understand what is at conflict here?

Upvotes: 0

Views: 52

Answers (1)

rici
rici

Reputation: 241721

1. Interpreting the dump file

If you look at the beginning of the .output file, you will see the following:

Rules useless in parser due to conflicts

   23 anys: text


State 26 conflicts: 1 shift/reduce
State 27 conflicts: 1 shift/reduce

The first warning tells you that the production anys: text was eliminated altogether because the resolution of parsing conflicts (elsewhere in the grammar) made it impossible for the rule to ever be used. (Thus, it is "useless".) The next two lines tell you where to find the conflicts: in states 26 and 27.

So the rule you quote is not the "alleged conflict" and the state you quote has nothing to do with conflicts (indeed, I have no idea why you focused on it.)

In the states with conflicts, you will see, for example:

State 26

   21 anys: text . elts anys
   23     | text .

    NODE_START  shift, and go to state 1

    NODE_START  [reduce using rule 23 (anys)]

    head     go to state 4
    element  go to state 5
    comment  go to state 7
    cdata    go to state 8
    elt      go to state 27
    elts     go to state 35

The conflict is indicated by a lookahead (in this case NODE_START) with two or more different actions. The action(s) enclosed in brackets (in this case [reduce using rule 23 (anys)]) were eliminated by bison's conflict resolution mechanism (which, in the absence of precedence declarations, chooses the shift action if there is one, and otherwise the reduce action with the smallest production number).

The state dumps should make it clear why the rule anys: text became useless. In both cases where it could be reduced, there was a shift-reduce conflict and the shift action was preferred.

2. Cause of the shift-reduce conflict

The problem is anys: text elts anys. Consider an input consisting of three elts. This could be parsed as an elts consisting of two elts followed by an elts consisting of a single elt, or vice versa. The ambiguity causes a shift-reduce conflict.

Another problem with that production is that it does not permit an elts to end with a text (unless it consists only of a single text.

A better definition would be the simple

anys: any | anys any

Note: you are using a bottom-up parser and right recursion is (literally) an anti-pattern. Writing your lists left-recursively as above will limit parser stack usage and cause senantic actions to run in the expected order (that is, left to right). Unless you have very specific needs, you should avoid right recursion.

Upvotes: 1

Related Questions