How to make the Parsec chainl1 function follow operator precedence rules

Question

I'm programming a standard math notation -> DC POSIX-compliant format converter. It takes the input string, parses it into an intermediary data type and then turns it into the output string by showing it.

This is the Data type used. I have no problems with the Data type -> Output String conversion, it works flawlessly:

data Expression = Expression :+ Expression
                | Expression :- Expression
                | Expression :* Expression
                | Expression :/ Expression
                | Expression :^ Expression
                | Cons String

infixr 0 :+
infixr 0 :-
infixr 1 :*
infixr 1 :/
infixr 2 :^

instance Show Expression where
  show (x :+ y) = unwords [show x, show y, "+"]
  show (x :- y) = unwords [show x, show y, "-"] 
  show (x :* y) = unwords [show x, show y, "*"]
  show (x :/ y) = unwords [show x, show y, "/"]
  show (x :^ y) = unwords [show x, show y, "^"]
  show (Cons y) = y

The Parsec parser part, however, refuses to comply with the defined operator precedency rules. Clearly because of the way chainl1 is used in the subexpression parser definition:

expression :: Parser Expression
expression = do
  spaces
  x <- subexpression
  spaces >> eof >> return x

subexpression :: Parser Expression
subexpression = (
    (bracketed subexpression) <|>
    constant
  ) `chainl1` (
    try addition              <|>
    try substraction          <|>
    try multiplication        <|>
    try division              <|>
    try exponentiation
  )

addition       = operator '+' (:+)
substraction   = operator '-' (:-)
multiplication = operator '*' (:*)
division       = operator '/' (:/)
exponentiation = operator '^' (:^)

operator :: Char -> (a -> a -> a) -> Parser (a -> a -> a)
operator c op = do
  spaces >> char c >> spaces
  return op

bracketed :: Parser a -> Parser a
bracketed parser = do
  char '('
  x <- parser
  char ')'
  return x

constant :: Parser Expression
constant = do
  parity <- optionMaybe $ oneOf "-+"
  constant <- many1 (digit <|> char '.')
  return (if parity == Just '-'
    then (Cons $ '_':constant)
    else  Cons       constant)

Is there a way of making the parser take into account the operator precedence rules without having to rewrite the entirety of my code?

Daniel Fischer · Accepted Answer

Well, you don't need to rewrite your entire code, but since your subexpression parser doesn't take precedence into account at all, you have to rewrite that - substantially.

One possibility is to build it from parsers for subexpressions with top-level operators with the same precedence,

atom :: Parser Expression
atom = bracketed subexpression <|> constant

-- highest precedence operator is exponentiation, usually that's
-- right-associative, hence I use chainr1 here
powers :: Parser Expression
powers = atom `chainr1` try exponentiation

-- a multiplicative expression is a product or quotient of powers,
-- left-associative
multis :: Parser Expression
multis = powers `chainl1` (try multiplication <|> try division)

-- a subexpression is a sum (or difference) of multiplicative expressions
subexpression :: Parser Expression
subexpression = multis `chainl1` (try addition <|> try substraction)

Another option is to let the precedence and associativities be taken care of by the library and use Text.Parsec.Expr, namely buildExpressionParser:

table = [ [binary "^" (:^) AssocRight]
        , [binary "*" (:*) AssocLeft, binary "/" (:/) AssocLeft]
        , [binary "+" (:+) AssocLeft, binary "-" (:-) AssocLeft]
        ]

binary  name fun assoc = Infix (do{ string name; spaces; return fun }) assoc

subexpression = buildExpressionParser table atom

(which requires that bracketed parser and constant consume the spaces after the used tokens).

How to make the Parsec chainl1 function follow operator precedence rules

Answers (1)

Related Questions