Shift/Reduce Conflict when making parser with Python

Question

I have written a parser with sly (https://github.com/dabeaz/sly/), however it has two shift/reduce conflicts for no reason. How am I supposed to fix this?

parser.py

@_("NAME ASSIGN primary")
    def statement(self, p):
        self.variables[p[0]] = p[2]

    @_("primary")
    def statement(self, p):
        return p[0]



    @_("primary PLUS secundary")
    def primary(self, p):
        return p[0] + p[2]

    @_("primary MINUS secundary")
    def primary(self, p):
        return p[0] - p[2]

    @_("secundary")
    def primary(self, p):
        return p[0]





    @_("secundary MULTIPLY tertiary")
    def secundary(self, p):
        return p[0] * p[2]

    @_("secundary FLOORDIV tertiary")
    def secundary(self, p):      
        return p[0] // p[2] 

    @_("secundary DIVIDE tertiary")
    def secundary(self, p):
        return p[0] / p[2]

    @_("secundary MOD tertiary")
    def secundary(self, p):
        return p[0] % p[2]

    @_("tertiary")
    def secundary(self, p):
        return p[0]




    @_("quaternary POWER tertiary")
    def tertiary(self, p):
        return p[0] ** p[2]

    @_("quaternary")
    def tertiary(self, p):
        return p[0]



    @_("tertiary quinary")   
    def quaternary(self, p):      
        return p[0] * p[1]

    @_("quinary")
    def quaternary(self, p):
        return p[0]




    @_("INTEGER")
    def quinary(self, p):
        return p[0]

    @_("LPAREN primary RPAREN")
    def quinary(self, p):
        return p[1]

debug.out

state 27

    (12) tertiary -> quaternary POWER tertiary .
    (14) quaternary -> tertiary . quinary
    (15) quinary -> . LPAREN primary RPAREN
    (16) quinary -> . INTEGER
  ! shift/reduce conflict for LPAREN resolved as shift
  ! shift/reduce conflict for INTEGER resolved as shift
    MOD             reduce using rule 12 (tertiary -> quaternary POWER tertiary .)
    DIVIDE          reduce using rule 12 (tertiary -> quaternary POWER tertiary .)
    FLOORDIV        reduce using rule 12 (tertiary -> quaternary POWER tertiary .)
    MULTIPLY        reduce using rule 12 (tertiary -> quaternary POWER tertiary .)
    MINUS           reduce using rule 12 (tertiary -> quaternary POWER tertiary .)
    PLUS            reduce using rule 12 (tertiary -> quaternary POWER tertiary .)
    $end            reduce using rule 12 (tertiary -> quaternary POWER tertiary .)
    RPAREN          reduce using rule 12 (tertiary -> quaternary POWER tertiary .)
    LPAREN          shift and go to state 8
    INTEGER         shift and go to state 9

    quinary                        shift and go to state 17

I only added a part of the debug file but the entire grammar as I believe the others are not relevant, please tell me if I am supposed to post any more code.

I do not understand how there can be any shift/reduce conflicts in my grammar because there are no parts where my rules are ambiguous.

rici · Accepted Answer

Your grammar is certainly ambiguous, because 1 POWER 2 3 can be parsed as either (1 POWER 2) * 3 or 1 POWER (2 * 3). In order to fix the grammar, you will need to decide which of these two interpretations is what you want, and then change the grammar to allow only the correct interpretation.

As far as I know, there's no consensus about the precedence of the implicit multiplication operator. Many people believe that it should have exactly the same precedence and associativity as explicit multiplication. Others feel that 2a/7b should be interpreted as (2*a)/(7*b), rather than ((2*a)/7)*b, leading to the idea that implicit multiplication should bind more tightly.

Things get even more complicated when exponentiation is introduced. In mathematics, exponentiation is generally written in two dimensions, using typographic effects. That makes the grouping of exponents explicit. For the rest, the convention that exponentiation binds more tightly (even more tightly than unary negation) is unproblematic, and we have:

(1) a2^b => a * (2 POWER b)
(2) a^2b => a POWER (2 * b).
(3) a²b => (a POWER 2) * b.

Without the implicit grouping provided by writing the exponents as superscripts, the last two examples above become indistinguishable. If the exponent operator is ^, both of them would be written as a^2b, and the parser would need to decide which of the two possible interpretations to assign to the string.

The problem with wanting unparenthesised expressions a2^b and a^2b to be parsed as (1) and (2) is that the precedence order differs: in case (1), we exponentiation takes precedence, while to achieve case (2) implicit multiplication must take precedence. I believe that's what your grammar is trying to do, and it fails precisely because precedence inversion of this form is very tricky (and usually leads to ambiguities). Interpreting a^2b as (3) is much more straight-forward, since the precedence orders are the same.

So I'd recommend that you accept interpretation (3). One important principle of language design is that things which are difficult for computers to disambiguate are often difficult for humans to disambiguation, too, and it's often best to avoid difficult disambiguation rules. (As a motivating example, see the so-called Most Vexing Parse in C++. Or the common errors which are the reason many C and C++ compilers issue warnings about suggesting parentheses in perfectly legal expressions like a << n + 1.)

The implementation of interpretation (3) is extremely straight-forward, since it requires only the usual cascading of precedence groups:

# This version makes implicit multiplication bind more tightly
# than division. Otherwise, implicit multiplication would just be
# added to secondary.
@_("tertiary quaternary")   
def tertiary(self, p):      
    return p[0] * p[1]

@_("quaternary")
def tertiary(self, p):
    return p[0]

@_("quinary POWER quaternary"
def quaternary(self, p):
    return p[0] ** p[2]

@_("quinary")
def quaternary(self, p):
    return p[0]

Shift/Reduce Conflict when making parser with Python

Answers (1)

Related Questions