Cocophony
Cocophony

Reputation: 23

Haskell parsing using an accumulator with Parsec

Haskell beginner here.

Say that I have a parser that I supply with some information and it returns the results of parsing a section and the info needed for the next section.

readSection :: Info -> Parser (Section, Info)

I was hoping to be able to parse multiple sections using the many combinator, but I can't get the types to work. I need some way of having a parser take in the Info of the previous computation. This structure reminds me of a fold (since the previous results can just be stored in the accumulator), but I'm not sure how to apply that here.

readSections :: Info -> Parser [Section]
readSections startInfo = do
  let parser = readSection startInfo
  -- ???
  return $ many readSection

Maybe there's a better way of doing something like this, so any guidance would be greatly appreciated. Thanks!

Upvotes: 2

Views: 139

Answers (2)

K. A. Buhr
K. A. Buhr

Reputation: 50864

If you're looking for an existing combinator, this looks like an application of unfoldrM from monad-loops (among other places). To avoid having to import it, it can be defined as:

unfoldrM :: Monad m => (a -> m (Maybe (b, a))) -> a -> m [b]
unfoldrM f a = do
  r <- f a
  case r of
    Nothing -> return []
    Just (b, a') -> (b:) <$> unfoldrM f a'

Basically, it takes an initial seed of type a and uses it to monadically generate values and new seeds. Your seed is your info: you use the info to monadically generate a parsed section plus a new version of the info.

You just have to stick in an optionMaybe to provide the Just/Nothing switch so unfoldrM knows when it's reached the end of all the sections:

readSections = unfoldrM (optionMaybe . readSection)

However, as a Haskell beginner, it may be helpful to see how you do this sort of thing from scratch. The many combinator isn't magical. It's basically equivalent to the relatively simple monadic computation:

many p = (do x <- p
             xs <- many p
             return (x:xs))
         <|> return []

So, you can write readSections from scratch as a monadic computation in a similar manner:

readSections' :: Info -> Parser [Section]
readSections' i = (do -- try to read a section, getting updated info
                      (s, i') <- readSection i
                      -- read zero or more remaining sections with updated info
                      ss <- readSections' i'
                      -- return the sections
                      return (s:ss))
                  -- if no more sections to read, return an empty list
                  <|> return []

Upvotes: 4

danidiaz
danidiaz

Reputation: 27756

It would be possible to write an "extended" version of many in an explicit way, but this answer uses StateT from "transformers" in order to reuse an already existing Alternative instance.

The function

readSection :: Info -> Parser (Section, Info)

Is quite reminiscent of what goes into the StateT constructor:

StateT :: (Info -> Parser (Section, Info)) -> StateT Info Parser Section

where Info would be the state and Parser the base monad. That is: each time we parse a Section, we modify the Info that the next parse attempt will use.

Furthermore, StateT has the following instance:

(Functor m, MonadPlus m) => Alternative (StateT s m)

If the base monad is a MonadPlus (like Parsec is) then StateT over the monad is an Alternative. This means that we can use the many from Alternative—but not the more restricted many :: ParsecT s u m a -> ParsecT s u m [a] from "parsec".

So we wan write:

import Control.Applicative
import Control.Monad.Trans.State.Strict

readSections :: Info -> Parser [Section]
readSections = evalStateT (Control.Applicative.many parser) 
  where
    parser :: StateT Info Parser Section
    parser = StateT readSection

Using evalStateT, which brings us back to a simple Parser and discards the final Info state.

Upvotes: 3

Related Questions