Reputation: 183
I have a very simple parser with pretty standard parsing functions used with Haskell. I have the parser recognizing what I want, but the problem is that is accepts any extra input after what I'm looking for has been recognized. The simple example code I'll give should recognize a string "x", then return, Just 'x', or return Nothing for an input string other than "x". Given input "x", it returns Just 'x', as it should, given the string "d", it returns Nothing, as it should, given "dx", it returns Nothing, as it should, but if given "xd", it will return Just 'x', while I want it to return Nothing.
newtype Parser a = P (String -> [(a,String)])
instance Functor Parser where
fmap g p = P (\inp -> case parse p inp of
[] -> []
[(v,out)] -> [(g v, out)])
instance Applicative Parser where
pure v = P (\inp -> [(v,inp)])
pg <*> px = P (\inp -> case parse pg inp of
[] -> []
[(g,out)] -> parse (fmap g px) out)
instance Monad Parser where
p >>= f = P (\inp -> case parse p inp of
[] -> []
[(v,out)] -> parse (f v) out)
parse (P p) inp = p inp
item :: Parser Char
item = P (\inp -> case inp of
[] -> []
(x:xs) -> [(x,xs)])
sat :: (Char -> Bool) -> Parser Char
sat p = do x <- item
if p x then return x else empty
empty = P (\inp -> [])
test = do { x <- sat (\x -> x == 'x'); return x}
run s = case parse test s of
[(a, _)] -> Just a
otherwise -> Nothing
given:
run "xd"
returns:
Just 'x'
My most reasonable attempt was to try to recognize anything that's not an alpha-numeric character, but that just seems to make the parser try to parse beyond the string length and it returns Nothing. Is there a standard way of dealing with this?
Upvotes: 2
Views: 319
Reputation: 2570
The fix is actually really simple. Just check that the remaining string after parsing is empty in the run
function. Like this:
run s = case parse test s of
[(a, "")] -> Just a
otherwise -> Nothing
Now any spurious characters will cause the function to return Nothing as you want.
An alternative is to split the check into a separate eof
parser that succeeds if the string is empty and fails if there are any characters left and add that to the end of your parser.
Upvotes: 6