Robert Campbell
Robert Campbell

Reputation: 6958

Please explain the behavior of this Parsec permutation parser

Why does this Parsec permutation parser not parse b?

p :: Parser (String, String)
p = permute (pair
              <$?> ("", pa)
              <|?> ("", pb))
  where pair a b = (a, b)

pa :: Parser String
pa = do
  char 'x'
  many1 (char 'a')

pb :: Parser String
pb = do
  many1 (char 'b')

λ> parseTest p "xaabb"
("aa","bb")               -- expected result, good
λ> parseTest p "aabb"
("","")                   -- why "" for b?

Parser pa is configured as optional via <$?> so I don't understand why its failing has impacted the parsing of b. I can change it to optional (char 'x') to get the expected behavior, but I don't understand why.

pa :: Parser String
pa = do
  optional (char 'x')
  many1 (char 'a')

pb :: Parser String
pb = do
  optional (char 'x')
  many1 (char 'b')

λ> parseTest p "xaaxbb"
parse error at (line 1, column 2):
unexpected "a"
expecting "b"
λ> parseTest p "xbbxaa"
("aa","bb")

How can both input orderings be supported when we have identical shared prefix "x"?

I also don't understand the impact that consumption of the optional "x" is having on the parse behavior:

pb :: Parser String
pb = do
  try px -- with this try x remains unconsumed and "aa" gets parsed
         -- without this try x is consumed, but "aa" isn't parsed even though "x" is optional anyway
  many1 (char 'b')

px :: Parser Char
px = do
  optional (char 'x')
  char 'x' <?> "second x"

λ> parseTest p "xaaxbb"            -- without try on px
parse error at (line 1, column 2):
unexpected "a"
expecting second x
λ> parseTest p "xaaxbb"            -- with try on px
("aa","")

Upvotes: 1

Views: 253

Answers (2)

Alec
Alec

Reputation: 32319

Why parseTest p "aabb" gives ("","")

The permutation parser tries to strip off the front of the given string prefixes that can be parsed by its constituent parsers (pa and pb in this case). Here, it will have tried to apply both pa and pb to "aabb" and failed in both cases - it never even gets around to trying to parse "bb".

Why can't both pa and pb start with optional (char 'x')

Looking at permute, you'll see it uses choice, which in turn relies on (<|>). As the documentation of (<|>) says,

This combinator implements choice. The parser p <|> q first applies p. If it succeeds, the value of p is returned. If p fails without consuming any input, parser q is tried. This combinator is defined equal to the mplus member of the MonadPlus class and the (<|>) member of Alternative.

The parser is called predictive since q is only tried when parser p didn't consume any input (i.e.. the look ahead is 1). This non-backtracking behaviour allows for both an efficient implementation of the parser combinators and the generation of good error messages.

So when you do something like parseTest p "xbb", pa doesn't fail immediately (it consumes and 'x') and then the whole thing fails because it cannot backtrack.

How to make shared prefixes work?

As Daniel has suggested, it is best to factor out your grammar. Alternately, you can use try:

The parser try p behaves like parser p, except that it pretends that it hasn't consumed any input when an error occurs

Based on what we talked about before for (<|>), you ought then to put try in front of both of optional (char 'x').

Upvotes: 5

Daniel Wagner
Daniel Wagner

Reputation: 153102

Why does this Parsec permutation parser not parse b?

Because 'a' is not a valid first character for either parser pa or parser pb.

How can both input orderings be supported when we have identical shared prefix "x"?

Shared prefixes must be factored out of your grammar; or backtracking points inserted (using try) at the cost of performance.

Upvotes: 2

Related Questions