Reputation: 6958
Why does this Parsec permutation parser not parse b?
p :: Parser (String, String)
p = permute (pair
<$?> ("", pa)
<|?> ("", pb))
where pair a b = (a, b)
pa :: Parser String
pa = do
char 'x'
many1 (char 'a')
pb :: Parser String
pb = do
many1 (char 'b')
λ> parseTest p "xaabb"
("aa","bb") -- expected result, good
λ> parseTest p "aabb"
("","") -- why "" for b?
Parser pa
is configured as optional via <$?>
so I don't understand why its failing has impacted the parsing of b. I can change it to optional (char 'x')
to get the expected behavior, but I don't understand why.
pa :: Parser String
pa = do
optional (char 'x')
many1 (char 'a')
pb :: Parser String
pb = do
optional (char 'x')
many1 (char 'b')
λ> parseTest p "xaaxbb"
parse error at (line 1, column 2):
unexpected "a"
expecting "b"
λ> parseTest p "xbbxaa"
("aa","bb")
How can both input orderings be supported when we have identical shared prefix "x"?
I also don't understand the impact that consumption of the optional "x" is having on the parse behavior:
pb :: Parser String
pb = do
try px -- with this try x remains unconsumed and "aa" gets parsed
-- without this try x is consumed, but "aa" isn't parsed even though "x" is optional anyway
many1 (char 'b')
px :: Parser Char
px = do
optional (char 'x')
char 'x' <?> "second x"
λ> parseTest p "xaaxbb" -- without try on px
parse error at (line 1, column 2):
unexpected "a"
expecting second x
λ> parseTest p "xaaxbb" -- with try on px
("aa","")
Upvotes: 1
Views: 253
Reputation: 32319
parseTest p "aabb"
gives ("","")
The permutation parser tries to strip off the front of the given string prefixes that can be parsed by its constituent parsers (pa
and pb
in this case). Here, it will have tried to apply both pa
and pb
to "aabb"
and failed in both cases - it never even gets around to trying to parse "bb"
.
pa
and pb
start with optional (char 'x')
Looking at permute
, you'll see it uses choice
, which in turn relies on (<|>)
. As the documentation of (<|>)
says,
This combinator implements choice. The parser
p <|> q
first appliesp
. If it succeeds, the value ofp
is returned. Ifp
fails without consuming any input, parserq
is tried. This combinator is defined equal to themplus
member of theMonadPlus
class and the(<|>)
member ofAlternative
.The parser is called predictive since
q
is only tried when parserp
didn't consume any input (i.e.. the look ahead is 1). This non-backtracking behaviour allows for both an efficient implementation of the parser combinators and the generation of good error messages.
So when you do something like parseTest p "xbb"
, pa
doesn't fail immediately (it consumes and 'x'
) and then the whole thing fails because it cannot backtrack.
As Daniel has suggested, it is best to factor out your grammar. Alternately, you can use try
:
The parser
try p
behaves like parserp
, except that it pretends that it hasn't consumed any input when an error occurs
Based on what we talked about before for (<|>)
, you ought then to put try
in front of both of optional (char 'x')
.
Upvotes: 5
Reputation: 153102
Why does this Parsec permutation parser not parse b?
Because 'a'
is not a valid first character for either parser pa
or parser pb
.
How can both input orderings be supported when we have identical shared prefix "x"?
Shared prefixes must be factored out of your grammar; or backtracking points inserted (using try
) at the cost of performance.
Upvotes: 2