RazorMx
RazorMx

Reputation: 427

How to parse simple expressions?

My homework is to make an expression parser in Haskell. I couldn't find help for my case from previous questions about parsing in Haskell on SO.

The expression is defined as follow:

data Expr =
| Const Int
| Var String
| Add Expr Expr | Sub Expr Expr | Mul Expr Expr | Pow Expr Int
deriving Show

The type Parser is defined as follow:

type Parser a = String -> Maybe (a, String)

The parsers given in my homework are defined as follow:

--Primitive parsers

success :: a -> Parser a
success v = \inp -> Just (v, inp)

failure :: Parser a
failure = \_ -> Nothing

item :: Parser Char
item = \inp -> case inp of 
                "" -> Nothing
                (c : cs) -> Just (c, cs)

--Sequential parsers

(>>>) :: Parser a -> (a -> Parser b) -> Parser b
p >>> f = \inp -> case p inp of
                    Nothing -> Nothing
                    Just (x, cs) -> f x cs

(&&&) :: Parser a -> Parser b -> Parser b 
p &&& q = p >>> \_ -> q

-- Conditional Parsers

sat :: (Char -> Bool) -> Parser Char
sat f = item >>> \c -> if f c then success c else failure

digit :: Parser Char 
digit = sat isDigit

space :: Parser Char
space = sat isSpace

char :: Char -> Parser Char
char c = sat (== c)

-- Alternative parsers

(+++) :: Parser a -> Parser a -> Parser a
p +++ q = \inp -> case p inp of
                    Nothing -> q inp 
                    Just res -> Just res

-- Iterative Parsers

many :: Parser a -> Parser [a]
many p = many1 p +++ success []

many1 :: Parser a -> Parser [a]
many1 p = p >>> \v -> many p >>> \vs -> success (v : vs)

--Token

nat :: Parser Int
nat = many1 digit >>> \cs -> success (read cs)

spaces :: Parser ()
spaces = many space &&& success ()

token :: Parser a -> Parser a
token p = spaces &&& p

natural :: Parser Int
natural = token nat

symbol :: Char -> Parser Char
symbol c = token (char c)

Now I have to develop a function expr that takes a String and converts it into a Expr.

To do that in my homework I find already the function defined as follow:

expr :: Parser Expr
expr = term >>> \t -> (symbol '+' &&& expr >>> \e -> success $ Add t e) +++ (symbol '-' &&& expr >>> \e -> success $ Sub t e) +++ success t

I understand that I have to develop three functions: term, power and factor.

The function factor should parse a constant or a variable.

The function term should parse a single factor (for example 2) or a multiplication between two factors (for example 2*x).

I don't understand very well how monads work and I'm having problems to do my homework. Any help on how these functions should be written?

Here is the image I found in my homework.

Diagrams for the parsing of term factor and expressions.

Upvotes: 3

Views: 845

Answers (1)

Cirdec
Cirdec

Reputation: 24156

The various parsers you need to write are building the precedence rules for the order of operations of arithmetic into the structure of the parsers. In order from most to least precedence these are

  • parenthesis and atomic literals like numbers and variables, which you will call a factor
  • exponentiation, which you will call a power
  • multiplications, which you will call a term
  • addition and subtraction, which you will call an expr.

You are starting on the outside with the operators with the lowest precedence and working in to the operators with the highest precedence. Moving in to multiplication, you can write term the same way you wrote expr. An expr consisted of terms; a term will consist of powers

term :: Parser Expr
term = power >>> \p -> (symbol '*' &&& term >>> \e -> success $ Mul p e) +++ success p

In the same way, a power will consist of factors. Since the right hand side of a Pow is an Int, it just uses a natural number nat instead of recursing.

power :: Parser Expr
power = factor >>> f -> (symbol '^' &&& nat >>> \n -> success $ Pow f n) +++ success f

I'll leave writing factor for you; the parts I wrote are almost the same as expr, which you already figured out. The parenthesis inside a factor will wrap an arbitrary expression expr, starting over again at the outside at the least precedence. This will allow you to parse things like "2*(x+1)"

Upvotes: 2

Related Questions