thesamet

Reputation: 6582

How to write a Parser that validates its input against a predicate and otherwise fails

I want to write a Parser that produces some data structure and validates its consistency by running a predicate on it. If the predicate returns false, the parser should return a custom Error object (as opposed to a Failure, which can already be produced with ^?).

I am looking for an operator on Parser that can achieve that. For example, let's say that I want to parse a list of integers and check that they are distinct. I would like to have something like this:

import util.parsing.combinator.RegexParsers

object MyParser extends RegexParsers {
  val number: Parser[Int] = """\d+""".r ^^ {_.toInt }
  val list = repsep(number, ",") ^!(checkDistinct, "numbers have to be unique")

  def checkDistinct(numbers: List[Int]) = (numbers.length == numbers.distinct.length)
}

The ^! in the code above is what I am looking for. How can I validate a parser output and return a useful error message if it does not validate?

Upvotes: 1

Views: 268

Answers (4)

thesamet

Reputation: 6582

Here is a complete Pimp My Library implementation:

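// Define this inside your parsers object or trait, where Parser, Success,
// Error, and NoSuccess are in scope.
implicit def validatingParsers[T](parser: Parser[T]) = new {
  def ^!(predicate: T => Boolean, error: => String) = Parser { in =>
    parser(in) match {
      case s @ Success(result, sin) =>
        if (predicate(result)) s        // validation passed: keep the original success
        else Error(error, sin)          // <--
      case e @ NoSuccess(_, _) => e     // failures and errors pass through unchanged
    }
  }
}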

The new operator ^! wraps the parser on its left in a new parser that applies the predicate to the parsed result and returns an Error with the given message when the predicate does not hold.

One important thing to note is the sin on the line marked with <--. Scala's parser library eventually reports the error or failure that occurred at the latest position in the input, so it is crucial to pass sin on that line instead of in: sin is the point where the inner parser finished its own parsing.

If we passed in instead of sin, the error that is eventually reported could be the last failure that happened while parsing the inner rule, even though we know the inner rule ultimately succeeded if we reached that line.

Upvotes: 1

Didier Dupont

Reputation: 29548

Parsers.commit transforms Failure to Error. So a first step would be

commit(p ^?(condition, message))

However, this would also give an error when p itself fails, which I suppose is not what you want: you want an error only when p succeeds and the check then fails. So you should instead do

p into {result => commit(success(result) ^? (condition,message))}

That may look rather contrived. You could also implement the operator directly: just copy the implementation of ^? and replace the failure with an error.
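As a rough sketch of the into/commit variant applied to the question's list of integers (assuming Scala 2 with the scala-parser-combinators library on the classpath; the object name CommitListParser is made up for illustration):

```scala
import scala.util.parsing.combinator.RegexParsers

object CommitListParser extends RegexParsers {
  val number: Parser[Int] = """\d+""".r ^^ { _.toInt }

  def checkDistinct(numbers: List[Int]): Boolean =
    numbers.length == numbers.distinct.length

  // repsep runs first; only once it has succeeded is its result re-wrapped
  // with success(...) and checked with ^?. commit then turns the Failure
  // produced by a failed check into an Error.
  val list: Parser[List[Int]] =
    repsep(number, ",") into { numbers =>
      commit(success(numbers) ^? (
        { case ns if checkDistinct(ns) => ns },   // defined only for distinct lists
        _ => "numbers have to be unique"))        // message used when the check fails
    }
}
```

Calling CommitListParser.parseAll(CommitListParser.list, "1,2,2") should then yield an Error rather than a Failure, because the commit wraps only the validation step and not p itself.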

Finally, you should probably do as Dylan suggests and add the operator. If you want to define it outside of your grammar (Parsers), I think you will need a mixin:

trait PimpedParsers { self: Parsers => 
   implicit def ...
}

Otherwise you cannot easily refer to the Parser type, which is a member of a single Parsers instance.
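A minimal sketch of what such a mixin could look like, assuming Scala 2 with the scala-parser-combinators library on the classpath. The names ValidatingParsers, ValidatingOps, toValidatingOps, and MixedInParser are invented for illustration, and the body of ^! follows the approach from thesamet's answer, using a named wrapper class instead of a structural type:

```scala
import scala.language.implicitConversions
import scala.util.parsing.combinator.{Parsers, RegexParsers}

// The self-type brings Parser, Success, Error, and NoSuccess of the
// enclosing Parsers instance into scope.
trait ValidatingParsers { self: Parsers =>
  class ValidatingOps[T](parser: Parser[T]) {
    def ^!(predicate: T => Boolean, error: => String): Parser[T] = Parser { in =>
      parser(in) match {
        case s @ Success(result, rest) =>
          // Report the error at `rest`, the position where `parser` finished.
          if (predicate(result)) s else Error(error, rest)
        case ns: NoSuccess => ns // failures and errors pass through unchanged
      }
    }
  }

  implicit def toValidatingOps[T](parser: Parser[T]): ValidatingOps[T] =
    new ValidatingOps(parser)
}

// Hypothetical grammar that mixes the operator in.
object MixedInParser extends RegexParsers with ValidatingParsers {
  val number: Parser[Int] = """\d+""".r ^^ { _.toInt }

  def checkDistinct(numbers: List[Int]): Boolean =
    numbers.length == numbers.distinct.length

  val list: Parser[List[Int]] =
    repsep(number, ",") ^! (checkDistinct, "numbers have to be unique")
}
```

Any grammar that extends a Parsers subtype can mix in the trait the same way, so the operator is written once and the implicit conversion is in scope wherever the grammar's parsers are defined.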

Upvotes: 2

Dylan

Reputation: 13922

One way to achieve this would be to use the Pimp My Library pattern to add the ^! operator to Parser[List[T]] (the return type of repsep). Define an implicit def, then import it into scope when you need to use it:

class ParserWithMyExtras[T](val parser: Parser[List[T]]) {
  def ^!(predicate: List[T] => Boolean, errorMessage: String) = {...}
}

implicit def augmentParser[T](parser: Parser[List[T]]) =
  new ParserWithMyExtras(parser)

Upvotes: 3

Alexander Azarov

Reputation: 13221

^? accepts an error message generator, and commit converts a Failure to an Error:

val list = commit {
  repsep(number, ",") ^? (
    // Return the numbers themselves so the result stays a Parser[List[Int]]
    { case numbers if checkDistinct(numbers) => numbers },
    _ => "numbers have to be unique" )
}

Upvotes: 0
