recursion.ninja
recursion.ninja

Reputation: 5488

How to return multiple parse failures within Parsec's monadic context?

I have a grammar I am parsing which consists of exactly two required and unique logical parts, Alpha and Beta. These parts can be defined in any order, Alpha before Beta or visa-vera. I would like to provide robust error messages for the less tech-savvy users.

In the example below there are cases where multiple parse failures exist. I concatenate the failure message Strings with the unlines function and pass the resulting concatenation into the fail combinator. This creates a ParseError value with a single Message value when parse is called on grammarDefinition.

Example Scenario:

import Data.Either                   (partitionEithers)
import Data.Set                      (Set)
import Text.Parsec                   (Parsec)
import Text.Parsec.Char
import Text.ParserCombinators.Parsec

data Result = Result Alpha Beta
type Alpha  = Set (Int,Float)
type Beta   = Set String

grammarDefinition :: Parsec String u Result
grammarDefinition = do
    segments <- partitionEithers <$> many segment
    _        <- eof
    case segments of
      (     [],      []) -> fail $ unlines [missingAlpha, missingBeta]
      (      _,      []) -> fail $ missingBeta
      (     [],       _) -> fail $ missingAlpha
      ((_:_:_), (_:_:_)) -> fail $ unlines [multipleAlpha, multipleBeta]
      (      _, (_:_:_)) -> fail $ multipleBeta
      ((_:_:_),       _) -> fail $ multipleAlpha
      (    [x],     [y]) -> pure $ Result x y
    where
      missingAlpha     = message "No" "alpha"
      missingBeta      = message "No" "beta"
      multipleAlpha    = message "Multiple" "alpha"
      multipleBeta     = message "Multiple" "beta"
      message x y      = concat [x," ",y," defined in input, ","exactly one ",y," definition required"]

-- Type signature is important!
segment :: Parsec String u (Either Alpha Beta)
segment = undefined -- implementation irrelevant

I would like the ParseError to contain multiple Message values in the case of multiple failures. This should be possible due to the existence of the addErrorMessage function. I am not sure hw to supply multiple failure within the Parsec monadic context, before the result is materialized by calling parse.

Example Function:

fails :: [String] -> ParsecT s u m a
fails = undefined -- Not sure how to define this!

How do I supply multiple Message values to the ParseError result within Parsec's monadic context?

Upvotes: 1

Views: 361

Answers (3)

ErikR
ErikR

Reputation: 52039

fail in this case is equivalent to parserFail defined in Text.Parsec.Prim:

parserFail :: String -> ParsecT s u m a
parserFail msg
    = ParsecT $ \s _ _ _ eerr ->
      eerr $ newErrorMessage (Message msg) (statePos s)

Since newErrorMessage and addErrorMessage both create a ParseError, this variation of parserFail should also work:

parserFail' :: String -> ParsecT s u m a
parserFail' msg
    = ParsecT $ \s _ _ _ eerr ->
      eerr $ theMessages s
where
  theMessages s =
    addErrorMessage (Message "blah") $
      addErrorMessage (Expect "expected this") $
        newErrorMessage (Message msg) (statePos s)

which should push 3 messages onto the error message list.

Also in that module, have a look at label and labels which is the only place where addErrorMessage is used. labels is just a multi-message version of the <?> operator. Note how it uses foldr to build up a compound error message:

labels :: ParsecT s u m a -> [String] -> ParsecT s u m a
labels p msgs =
    ParsecT $ \s cok cerr eok eerr ->
    let eok' x s' error = eok x s' $ if errorIsUnknown error
                  then error
                  else setExpectErrors error msgs
        eerr' err = eerr $ setExpectErrors err msgs

    in unParser p s cok cerr eok' eerr'

 where
   setExpectErrors err []         = setErrorMessage (Expect "") err
   setExpectErrors err [msg]      = setErrorMessage (Expect msg) err
   setExpectErrors err (msg:msgs)
       = foldr (\msg' err' -> addErrorMessage (Expect msg') err')
         (setErrorMessage (Expect msg) err) msgs

The only gatcha is that you need access to the ParsecT constructor which is not exported by Text.Parsec.Prim. Maybe you can find a way to use labels or another way around that problem. Otherwise you could always include your own hacked version of parsec with your code.

Upvotes: 2

recursion.ninja
recursion.ninja

Reputation: 5488

I would recommend transitioning from Parsec to newer and more extensible Megaparsec library.

This exact issue has been resolved since version 4.2.0.0.

Multiple parse error Messages can easily be created with the following function:

fails :: MonadParsec m => [String] -> m a
fails = failure . fmap Message

Upvotes: 0

recursion.ninja
recursion.ninja

Reputation: 5488

We can leverage the fact that ParsecT is an instance of MonadPlus to combine the definition of mzero with the function labels to derive the desired result:

fails :: [String] -> ParsecT s u m a
fails = labels mzero

Note: The ParseError has many Expect values, not many Message values...

Upvotes: 0

Related Questions