user2345

Reputation: 3227

Combinator parser ~ also matches spaces

I am trying to parse strings that follow the grammar (x|y)+. That is, the following should match:

x
y
xyyxyyxy
xyyxxy
and so on...

I have the following code:

import scala.util.parsing.combinator._

class XYs extends JavaTokenParsers {
  def E: Parser[Any] = (C ~ E) | C
  def C: Parser[Any] = "x" | "y"
}

object Main extends XYs {
  def main(args: Array[String]): Unit = {
    while (true) {
      println(parse(E, scala.io.StdIn.readLine()))
    }
  }
}

This parses the strings that should match. However, it also accepts some that should be rejected, namely those containing spaces.

For example, xyy xyx is accepted, and so is xyyxy xyyx xy. Is there an easy way to keep spaces from being skipped during parsing? Maybe a different "operator" than ~?

Upvotes: 0

Views: 131

Answers (1)

ymonad

Reputation: 12090

According to the documentation, skipWhitespace is turned on by default for RegexParsers, and therefore also for JavaTokenParsers, which is a subclass of RegexParsers:

The parsing methods call the method skipWhitespace (defaults to true) and, if true, skip any whitespace before each parser is called.

You can just turn it off by overriding it.

class XYs extends JavaTokenParsers {
  override def skipWhitespace = false
  def E: Parser[Any] = (C ~ E) | C
  def C: Parser[Any] = "x" | "y"
}

You can also use rep1 to match one or more repetitions, instead of writing the recursion by hand:

class XYs extends JavaTokenParsers {
  override def skipWhitespace = false
  def E: Parser[Any] = rep1(C)
  def C: Parser[Any] = "x" | "y"
}
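
One caveat worth checking: parse only requires a prefix of the input to match, so even with skipWhitespace turned off, parse(E, "xyy xyx") still reports success after consuming xyy. To reject such input you probably want parseAll, which requires the whole string to match. Below is a minimal sketch, assuming the scala-parser-combinators module is on the classpath; the object name XYsCheck and the helper accepts are made up for illustration and are not part of the library.

```scala
import scala.util.parsing.combinator._

// Same grammar as above, with whitespace skipping disabled.
// XYsCheck and accepts are hypothetical names used only for this sketch.
object XYsCheck extends JavaTokenParsers {
  override def skipWhitespace: Boolean = false

  def C: Parser[String] = "x" | "y"
  def E: Parser[List[String]] = rep1(C)

  // True only if the entire input matches (x|y)+.
  def accepts(input: String): Boolean = parseAll(E, input).successful

  def main(args: Array[String]): Unit = {
    println(accepts("xyyxyyxy"))             // true
    println(accepts("xyy xyx"))              // false: the space is not skipped
    println(parse(E, "xyy xyx").successful)  // true: parse stops after the prefix "xyy"
  }
}
```

Both pieces seem to be needed: with the default skipWhitespace, parseAll would still accept xyy xyx because the space is skipped before the next literal is matched.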

Upvotes: 1
