sinoTrinity
sinoTrinity

Reputation: 1195

Operator precedence issue when parsing with Megaparsec

I was parsing a C-like language with array and struct. Following C operator precedence, . and [] are made of equal precedence.

opTable :: [[Operator Parser Expr]]
opTable = [[ InfixL $ Access <$ symbol "." , opSubscript]]

opSubscript = Postfix $ foldr1 (.) <$> some singleIndex
singleIndex = do
    index < brackets expr
    return $ \l -> ArrayIndex l index

When parsing

Struct S {
  int[3] a;
}
Struct S s;
s.a[1]

it yielded Access (Var "s") (ArrayIndex (Var "a") 1) instead of ArrayIndex (Access (Var "s") (Var "a")) 1 Why? Is it because [] is not parsed as InfixL?

Update: After changing it to

opTable :: [[Operator Parser Expr]]
opTable = [[ PostFix $ (\ident expr -> Access expr ident) <$ symbol "." <*> identifier, opSubscript]]

I got another error

s.a[1]
|  ^
unexpected '['
expecting ')', '_', alphanumeric character, or operator

Upvotes: 1

Views: 278

Answers (1)

K. A. Buhr
K. A. Buhr

Reputation: 50819

The documentation for makeExprParser from parser-combinators is terrible with respect to prefix and postfix operators.

First, it fails to explain that with a mixture of prefix/postfix/infix operators at the supposed "same" precedence level, the prefix/postfix operators are always treated as higher precedence than the infix operators.

Second, when it makes the claims that "prefix and postfix operators of the same precedence can only occur once" and then gives --2 as an example for prefix operator -, it actually means that even two separate prefix operators (or two separate postfix operators) aren't allowed, so +-2 with separate prefix operators + and - isn't allowed either. What is allowed is a single prefix operator and a single postfix operator, at the same level, in which case the association is to the left, so -2! is okay (assuming - and ! are prefix and postfix operators at the same precedence level) and is parsed as (-2)!.

Oh, and third, the documentation never makes it clear that the example code for manyUnaryOp only works correctly for multiple prefix operators, and a non-obvious change is needed to get multiple postfix operators in the right order.

So, your first attempt doesn't work because the postfix operator is of secretly higher precedence than the infix operator. Your second attempt doesn't work because two different postfix operators at the same precedence level can't be parsed.

Your best bet is to parse single "postfix operator" consisting of a chain of access and index operations. Note the need for flip to get the ordering right for postfix operators.

opTable :: [[Operator Parser Expr]]
opTable = [[ indexAccessChain ]]

indexAccessChain = Postfix $ foldr1 (flip (.)) <$> some (singleIndex <|> singleAccess)

singleIndex = flip ArrayIndex <$> brackets expr
singleAccess = flip Access <$> (char '.' *> identifier)

A self-contained example:

{-# OPTIONS_GHC -Wall #-}
module Operators where

import Text.Megaparsec
import Text.Megaparsec.Char
import Control.Monad.Combinators.Expr
import Data.Void

type Parser = Parsec Void String

data Expr
  = Access Expr String
  | ArrayIndex Expr Expr
  | Var String
  | Lit Int
  deriving (Show)

expr :: Parser Expr
expr = makeExprParser term opTable

identifier :: Parser String
identifier = some letterChar

term :: Parser Expr
term = Var <$> identifier
  <|>  Lit . read <$> some digitChar

opTable :: [[Operator Parser Expr]]
opTable = [[ indexAccessChain ]]

indexAccessChain :: Operator Parser Expr
indexAccessChain = Postfix $ foldr1 (flip (.)) <$> some (singleIndex <|> singleAccess)

singleIndex, singleAccess :: Parser (Expr -> Expr)
singleIndex = flip ArrayIndex <$> brackets expr
singleAccess = flip Access <$> (char '.' *> identifier)

brackets :: Parser a -> Parser a
brackets = between (char '[') (char ']')

main :: IO ()
main = parseTest expr "s.a[1][2][3].b.c[4][5][6]"

Upvotes: 2

Related Questions