Reputation: 3711
Let's say I have different parsers p1, ..., pk
. I want to define a function pk :: Parser ([t1], ..., [tk])
where pi :: Parser ti
.
That will parse a collection of strings (one after the other) that match any of p1...pk and separate them in the corresponding lists. Assume for simplicity that none of the strings matches two parsers.
I managed to do it, but I am really struggling to find an elegant way (preferably using many or any other builtin parsec parser).
Upvotes: 3
Views: 739
Reputation: 77424
You can't do this completely generically without type-level trickery, unfortunately. However, an approach along the lines of Daniel Wagner's suggestion can be constructed in DIY style, using polymorphic combinators and some pre-existing recursive instances. I'll present a simple example to demonstrate.
First, a few simpleminded parsers to test with:
type Parser = Parsec String ()
parseNumber :: Parser Int
parseNumber = read <$> many1 digit
parseBool :: Parser Bool
parseBool = string "yes" *> return True
<|> string "no" *> return False
parseName :: Parser String
parseName = many1 letter
Next we create a "base case" type to mark the end of the possibilities, and give it a parser that always succeeds on no input.
data Nil = Nil deriving (Eq, Ord, Read, Show, Enum, Bounded)
instance Monoid Nil where
mempty = Nil
mappend _ _ = Nil
parseNil :: Parser Nil
parseNil = return Nil
This is equivalent to ()
, of course--I'm only creating a new type to disambiguate in case we actually want to parse ()
. Next, the combinator that chains parsers together:
infixr 3 ?|>
(?|>) :: (Monoid b) => Parser a -> Parser b -> Parser ([a], b)
p1 ?|> p2 = ((,) <$> ((:[]) <$> p1) <*> pure mempty)
<|> ((,) <$> pure [] <*> p2)
What's going on here is that p1 ?|> p2
tries p1
, and if it succeeds wraps that in a singleton list and fills in mempty
for the second element of the tuple. If p1
fails, if fills in an empty list and uses p2
for the second element.
parseOne :: Parser ([Int], ([Bool], ([String], Nil)))
parseOne = parseNumber ?|> parseBool ?|> parseName ?|> parseNil
Combining parsers with the new combinator is simple, and the result type is pretty self-explanatory.
parseMulti :: Parser ([Int], ([Bool], ([String], Nil)))
parseMulti = mconcat <$> many1 (parseOne <* newline)
We're relying on the recursive Monoid
instance for tuples and the usual instance for lists to combine multiple lines. Now, to show that it works, define a quick test:
runTest = parseTest parseMulti testInput
testInput = unlines [ "yes", "123", "acdf", "8", "no", "qq" ]
Which parses successfully as Right ([123,8],([True,False],(["acdf","qq"],Nil)))
.
Upvotes: 5
Reputation: 153332
The first step is to turn each of your parsers into parser of the big type:
p1 :: Parser t1
p2 :: Parser t2
p3 :: Parser t3
p1 = undefined
p2 = undefined
p3 = undefined
p1', p2', p3' :: Parser ([t1], [t2], [t3])
p1' = fmap (\x -> ([x], [], [])) p1
p2' = fmap (\x -> ([], [x], [])) p2
p3' = fmap (\x -> ([], [], [x])) p3
Now, we repeatedly choose from these last parsers, and concatenate the results at the end:
parser :: Parser ([t1], [t2], [t3])
parser = fmap mconcat . many . choice $ [p1', p2', p3']
There are Monoid
instances for tuples up to size five; beyond that, you can either use nested tuples or a more appropriate data structure.
Upvotes: 6
Reputation: 40797
Representing the parsers as a list makes this easy. Using:
choice :: [Parser a] -> Parser a
many :: Parser a -> Parser [a]
We can write:
combineParsers :: [Parser a] -> Parser [a]
combineParsers = many . choice
This isn't quite right, as it bundles them all into a single list. Let's associate each parser with a unique identifier:
combineParsers' :: [(k, Parser a)] -> Parser [(k, a)]
combineParsers' = many . choice . combine
where combine = map (\(k,p) -> (,) k <$> p)
We can then convert this back to the list form:
combineParsers :: [Parser a] -> Parser [[a]]
combineParsers ps = map snd . sortBy fst <$> combineParsers' (zip [0..] ps)
You could perhaps make this more efficient by writing combineParsers' :: [(k, Parser a)] -> Parser (Map k [a])
instead, using e.g. combine = map $ \(k,p) -> fmap (\a -> Map.insertWith (++) k [a]) p
.
This requires that all the parsers have the same result type, so you'll need to wrap each of their results in a data-type with Cons <$> p
for each p. You can then unwrap the constructor from each list. Admittedly, this is fairly ugly, but to do it heterogeneously with tuples would require even uglier type-class hacks.
There might be a simpler solution for your specific use-case, but this is the easiest way to do it generically that I can think of.
Upvotes: 5