Mihail Feraru
Mihail Feraru

Reputation: 1479

Monadic Parser - handling string with one character

I was reading this Monadic Parsing article while I was trying to implement a pretty simple string parser in Haskell and also get a better understanding of using monads. Down below you can see my code, implementing functions for matching a single character or a whole string. It works as expected, but I observed two strange behaviors that I can't explain.

  1. I have to handle single characters in string, otherwise, the parser will return only empty lists. To be exact, if I remove this line string [c] = do char c; return [c] it won't work anymore. I was expecting that string (c:s) would handle string (c:[]) properly. What could be the cause here?

  2. In my opinion, string definition should be equivalent to string s = mapM char s as it would create a list of [Parser Char] for each character in s and collect the results as Parser [Char]. If I use the definition based on mapM, the program would get stuck in an infinite loop and won't print anything. Is something about lazy evalutation that I miss here?

.

module Main where

newtype Parser a = Parser { apply :: String->[(a, String)] }

instance Monad Parser where
    return a = Parser $ \s -> [(a, s)]
    ma >>= k = Parser $ \s -> concat [apply (k a) s' | (a, s') <- apply ma s]

instance Applicative Parser where
    pure = return
    mf <*> ma = do { f <- mf; f <$> ma; }

instance Functor Parser where
    fmap f ma = f <$> ma

empty :: Parser a
empty = Parser $ const []

anychar :: Parser Char
anychar = Parser f where
    f [] = []
    f (c:s) = [(c, s)]

satisfy :: (Char -> Bool) -> Parser Char
satisfy prop = do
    c <- anychar
    if prop c then return c
    else empty

char :: Char -> Parser Char
char c = satisfy (== c)

string :: String -> Parser String
string [] = empty
string [c] = do char c; return [c] --- if I remove this line, all results will be []
string (c:s) = do char c; string s; return (c:s)

main = do
    let s = "12345"
    print $ apply (string "123") s
    print $ apply (string "12") s
    print $ apply (string "1") s
    print $ apply (string []) s

PS. I think the title of the question is not suggestive enough, please propose an edit if you have a better idea.

Upvotes: 2

Views: 166

Answers (2)

  1. Since you did string [] = empty instead of string [] = return [], you can't use it as a base case for recursion that builds up a list.
  2. fmap f ma = f <$> ma is wrong, since <$> is defined in terms of fmap. If you want to define fmap in terms of your other instances, then do fmap = liftA or fmap = liftM. Since mapM uses fmap internally but your original string didn't, this problem didn't come up in your first simple test.

Upvotes: 2

chi
chi

Reputation: 116174

string [] = empty

means: "If you need to parse an empty string, fail -- it can not be parsed at all, no matter what's the input string".

By comparison,

string [] = return ""

means: "If you need to parse an empty string, succeed and return the empty string -- it can always be parsed, no matter what's the input string".

By using the first equation, when you recurse in the case string (c:cs) you need to stop at one character (string [c]) since reaching zero characters will run empty and make the whole parser fail.

Hence, you need to either use that string [c] = return [c] equation, or modify the base "empty string" case so that it succeeds. Arguably, the latter would be more natural.

Upvotes: 2

Related Questions