Reputation: 9465
Below is a very simplified XML grammar for Bison:
head : NODE_START NAME atts
| NODE_START NAME
;
element : head NODE_CLOSE NODE_END
| head NODE_END anys NODE_START NODE_CLOSE NAME NODE_END
| head NODE_END NODE_START NODE_CLOSE NAME NODE_END
;
text : TEXT
;
comment : NODE_START COMMENT_START COMMENT_END NODE_END
;
cdata : NODE_START CDATA_START CDATA_END NODE_END
;
attr : NAME EQUALS value
;
value : QUOTED
| APOSED
;
atts : attr atts
| attr
elt : element
| comment
| cdata
any : elt
| text
;
elts : elt elts
| elt
;
anys : text elts anys
| elts
| text
;
s : any
| PROLOG any
;
The alleged conflict is the rule anys -> text
.
When I look at the corresponding output:
State 35
21 anys: text elts . anys
NODE_START shift, and go to state 1
TEXT shift, and go to state 2
head go to state 4
element go to state 5
text go to state 25
comment go to state 7
cdata go to state 8
elt go to state 26
elts go to state 27
anys go to state 42
How do I understand what is at conflict here?
Upvotes: 0
Views: 52
Reputation: 241721
If you look at the beginning of the .output file, you will see the following:
Rules useless in parser due to conflicts
23 anys: text
State 26 conflicts: 1 shift/reduce
State 27 conflicts: 1 shift/reduce
The first warning tells you that the production anys: text
was eliminated altogether because the resolution of parsing conflicts (elsewhere in the grammar) made it impossible for the rule to ever be used. (Thus, it is "useless".) The next two lines tell you where to find the conflicts: in states 26 and 27.
So the rule you quote is not the "alleged conflict" and the state you quote has nothing to do with conflicts (indeed, I have no idea why you focused on it.)
In the states with conflicts, you will see, for example:
State 26
21 anys: text . elts anys
23 | text .
NODE_START shift, and go to state 1
NODE_START [reduce using rule 23 (anys)]
head go to state 4
element go to state 5
comment go to state 7
cdata go to state 8
elt go to state 27
elts go to state 35
The conflict is indicated by a lookahead (in this case NODE_START
) with two or more different actions. The action(s) enclosed in brackets (in this case [reduce using rule 23 (anys)]
) were eliminated by bison's conflict resolution mechanism (which, in the absence of precedence declarations, chooses the shift action if there is one, and otherwise the reduce action with the smallest production number).
The state dumps should make it clear why the rule anys: text
became useless. In both cases where it could be reduced, there was a shift-reduce conflict and the shift action was preferred.
The problem is anys: text elts anys
. Consider an input consisting of three elt
s. This could be parsed as an elts
consisting of two elt
s followed by an elts
consisting of a single elt
, or vice versa. The ambiguity causes a shift-reduce conflict.
Another problem with that production is that it does not permit an elts
to end with a text
(unless it consists only of a single text
.
A better definition would be the simple
anys: any | anys any
Note: you are using a bottom-up parser and right recursion is (literally) an anti-pattern. Writing your lists left-recursively as above will limit parser stack usage and cause senantic actions to run in the expected order (that is, left to right). Unless you have very specific needs, you should avoid right recursion.
Upvotes: 1