Haskell laziness question or why this monad not working as expected

Question

here is kinda huge piece of code, it finally got huge, because of monadic stuff, but task is simple: parse following string to data structure:

"hello(some, args)" -> [("fid","hello"),("sym","("),("args","some, args"),("sym",")")]

but code i written produces following:

"hello(some, args)" -> [("fid",""),("sym","("),("args",""),("sym",")")]

as you can see 'args' and 'fid' values are lost somewhere on the way, i suspect compiler decided not to calculate them for some mysterious reason.

I guess the code is completely bad, also i marked with "?"'s parts which seem useless to me, but compiler forced me to leave them in place :)

And here is the code:

type PStream = String
type PToken a = (String, a)
data Pstate a = Pstate (String -> ([PToken String], PStream)) a

instance Monad Pstate where
    return x = Pstate (\_ -> ([("start", "")], "?")) x -- not used ?
    (Pstate bindparser v) >>= f  = Pstate newparser fv
        where
            Pstate fparser fv = f v
            (btok, brest) = bindparser "this string also not used"
            (tok, rest) = fparser brest
            newparser _ = (btok ++ tok, rest)

-- parsers
parseFid :: Pstate String
parseFid = Pstate parser "???"
    where parser r = let (fid, rest) = span (/= '(') r in ([("fid", fid)],rest)

parseSym :: Char -> Pstate String
parseSym c = Pstate parser "???"
    where parser r = let rest = parseOne c r in ([("sym", [c])],rest)

parseOne s (h:t) = if h == s then t else error $ "symbol not match:" ++ [h] ++ " /= " ++ [s]
parseOne s []    = []

parseArgs :: Pstate String
parseArgs = Pstate parser "???"
    where parser r = let (args,rest) = span (/=')') r in ([("args", args)],rest)

-- util
load :: String -> Pstate String
load s = Pstate (\ls -> ([("load", "")],ls)) "???"

runP :: Pstate String -> ([PToken String], PStream)
runP (Pstate fparser fv) = fparser "???"

-- combined parser
parseFunction :: String -> Pstate String
parseFunction s = do
    load s --- should be 'return' here ?
    parseFid
    parseSym '('
    parseArgs
    parseSym ')'

main = putStrLn $ show $ runP $ parseFunction "hello(a b c)"

fuz · Accepted Answer

First, about the "???" you had to leave there. Consider your definition of Pstate:

data Pstate a = Pstate (String -> ([PToken String], PStream)) a

This means, that your data constructor has the following type:

Pstate :: (String -> ([PToken String], PStream)) -> a -> Pstate a

This is the default construct of a monad. If you define monadic combinators, it's actually not uncommon to have some combinators where this is not needed, so the convention is to leave it to () in this case.

But actually I think that your code is very strange, it seems like you didn't grabbed the point of a stateful monad. Let me explain:

Usually, a stateful computation has this type:

data MyState a = MyState (TypeOfState -> (a, TypeOfState))

This means, that your monadic action is actually some kind of computation, that does something (possible with your piece of state) and than returns a result and a new state. The state is wrapped up in the monad, so you don't have to think about it.

In your code, you're using the same pattern, but somewhat different. It seems like that you fixed the result of the computation to [PToken String]. Let me fix up your definition a bit:

data Pstate a = Pstate (PStream -> (a, PStream))

So now, you get the return value of your computation by applying the combinators, which look like this:

instance Monad Pstate where
  -- return should just wrap the computation up, so no changes
  return x = Pstate (\p -> (x,p))
  parser1 >>= parser2  = Pstate $ \input -> let
    Pstate parser1' = parser1
    Pstate parser2' = parser2
    (output, rest) = parser1' input
    result = parser2' output rest
    in result

Now, you can look at the type signatures for your parsers, they should be something like this: parseFid :: Pstate [PToken PStream]. This means, your parser consumes some input and returns the parsed stuff as [PToken PStream] and sets the new input to what is left. Consider this definition of parseFid about how it could look like:

parseFid :: Pstate [PToken PStream]
parseFid = Pstate $ 
 -> let (fid, rest) = span (/= '(') r in ([("fid", fid)],rest)

The rest is left as an exercise to the reader. I would suggest you to reformulate your parser using the State monad from Control.Monad.State.Strict instead. You'll see, that the monad above is basically the same.

Actually, it's most times easier to rely on existing and well known tools, instead of rolling down an own parser. Here's a parser for what you need created with Parsec, a state of the art library for parsing:

import Text.Parsec

parseFunction = do name   <- parseName
                   obrace <- parseOpeningBrace
                   args   <- parseArguments
                   cbrace <- parseClosingBrace
                   return [name,obrace,args,cbrace]

parseName         = many (noneOf "(") >>= 
ame -> return ("fid",name)
parseOpeningBrace = char '(' >> return ("sym","(")
parseArguments    = many (noneOf ")") >>= 
ame -> return ("args",name)
parseClosingBrace = char ')' >> return ("sym",")")

main = case parse parseFunction "" "hello(some, args)" of
  Left error   -> print error
  Right result -> print result

Here's the output:

[("fid","hello"),("sym","("),("args","some, args"),("sym",")")]

I actually would suggest you to think of some better representation of the parsed function, this may make things easier.

Haskell laziness question or why this monad not working as expected

Answers (2)

Related Questions