How do I parse infix instead of prefix with Haskell?

Question

I need help with this program I am trying to write in Haskell. I have written most of it, and here is what I am basically trying to do:

When I write

parse "a + b"

in the terminal I want this as output:

Plus (Word "a") (Word "b")

When I write

parse "a - 2 * b + c"

in the terminal I want this as output:

Minus (Word "a") (Plus (Mult (Num 2) (Word "b")) (Word "c"))

My code so far:

data Ast
    = Word String
    | Num Int
    | Mult Ast Ast
    | Plus Ast Ast
    | Minus Ast Ast
    deriving (Eq, Show)

tokenize :: [Char] -> [String]
tokenize [] = []
tokenize (' ' : s) = tokenize s
tokenize ('+' : s) = "+" : tokenize s
tokenize ('*' : s) = "*" : tokenize s
tokenize (c : s)
  | isDigit c =
    let (cs, s') = collectWhile isDigit s
     in (c : cs) : tokenize s'
  | isAlpha c =
    let (cs, s') = collectWhile isAlpha s
     in (c : cs) : tokenize s'
  | otherwise = error ("unexpected character " ++ show c)

collectWhile :: (Char -> Bool) -> String -> (String, String)
collectWhile p s = (takeWhile p s, dropWhile p s)

isDigit, isAlpha :: Char -> Bool
isDigit c = c `elem` ['0' .. '9']
isAlpha c = c `elem` ['a' .. 'z'] ++ ['A' .. 'Z']

parseU :: [String] -> (Ast, [String])
parseU ("+" : s0) =
  let (e1, s1) = parseU s0
      (e2, s2) = parseU s1
   in (Plus e1 e2, s2)
parseU ("*" : s0) =
  let (e1, s1) = parseU s0
      (e2, s2) = parseU s1
   in (Mult e1 e2, s2)
parseU (t : ts)
  | isNumToken t = (Num (read t), ts)
  | isWordToken t = (Word t, ts)
  | otherwise = error ("unrecognized token " ++ show t)
parseU [] = error "unexpected end of input"

isNumToken, isWordToken :: String -> Bool
isNumToken xs = takeWhile isDigit xs == xs
isWordToken xs = takeWhile isAlpha xs == xs

parse :: String -> Ast
parse s =
  case parseU (tokenize s) of
    (e, []) -> e
    (_, t : _) -> error ("unexpected token " ++ show t)

inn :: Ast -> String
inn (Plus x y) = innP x ++ " + " ++ innP y
inn (Mult x y) = innP x ++ " * " ++ innP y
inn ast = innP ast

innP :: Ast -> String
innP (Num n) = show n
innP (Plus x y) = "(" ++ innP x ++ " + " ++ innP y ++ ")"
innP (Mult x y) = "(" ++ innP x ++ " * " ++ innP y ++ ")"
innP (Word w) = w -- 

innfiks :: String -> String
innfiks s = inn (parse s)

Right now I get an error posting the text I wrote in the terminal, but when I write it like this:

parse "+ a b"

I get the correct output:

Plus (Word "a") (Word "b")

I know I have to change the code so it accepts what I send to the parse function on this form:

value operator value,

and not on this form:

operator value value

But im struggling to find out what and where I have to do this change.

How do I parse infix instead of prefix with Haskell?

Answers (1)

Related Questions