Setting Whitespace as Delimiter in JavaTokenParsers

Question

Extending JavaTokenParsers, I have the following:

class Foo extends JavaTokenParsers { 
  lazy val check = id ~ action ~ obj

  lazy val id     = "FOO" | "BAR"
  lazy val action = "GET" | "SET"
  lazy val obj = "BAZ" | "BIZ"
}

I had assumed that whitespace would act as a delimiter, so I was confused when check parsed the following expression successfully: FOO GETBAZ.

val result = parseAll(check, "FOO GETBAZ")
println(result.get)

Result

((FOO~GET)~BAZ)

How can I make whitespace act as a delimiter, so that the above fails to parse because GETBAZ matches neither of action's alternatives, GET or SET?

Travis Brown · Accepted Answer

JavaTokenParsers adds some methods to RegexParsers, but it doesn't change the behavior of literal, which will match its argument without worrying about what's around it.

The skipWhitespace setting isn't going to help you, either, since it only specifies whether whitespace will be ignored, not whether it's required.
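To make the behavior concrete, here is a minimal sketch, assuming the scala-parser-combinators library is on the classpath. The names LiteralDemo and pair are made up for illustration. It shows that two literals in sequence accept the input with or without a space between them, because leading whitespace is skipped when present but never demanded.

```scala
import scala.util.parsing.combinator.JavaTokenParsers

// Illustrative only: LiteralDemo and pair are not part of the original question.
object LiteralDemo extends JavaTokenParsers {
  // Two literals in sequence; nothing here requires a separator between them.
  val pair: Parser[String ~ String] = "GET" ~ "BAZ"

  def main(args: Array[String]): Unit = {
    // Whitespace before "BAZ" is skipped, so this succeeds.
    println(parseAll(pair, "GET BAZ").successful) // true
    // No whitespace to skip, and none is required, so this succeeds too.
    println(parseAll(pair, "GETBAZ").successful)  // true
  }
}
```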

You have a couple of options. One would be to use regular expressions with word boundaries:

class Foo extends JavaTokenParsers {
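  // The backslashes are doubled because the s interpolator processes escapes;
  // a single \b would become a backspace character, not a regex word boundary.
  def word(s: String): Parser[String] = regex(s"\\b$s\\b".r)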

  lazy val check = id ~ action ~ obj    

  val id     = word("FOO") | word("BAR")
  val action = word("GET") | word("SET")
  val obj    = word("BAZ") | word("BIZ")
}

Or you could parse a whole identifier with ident and keep it only if it matches exactly:

class Foo extends JavaTokenParsers {
  def word(s: String): Parser[String] = ident.filter(_ == s)

  lazy val check = id ~ action ~ obj    

  val id     = word("FOO") | word("BAR")
  val action = word("GET") | word("SET")
  val obj    = word("BAZ") | word("BIZ")
}

Or you could manually add whitespace parsers between each of your items.
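The manual-whitespace option might look something like the sketch below. It is one possible reading of that suggestion, not necessarily what was intended: it turns off automatic skipping and demands whitespace explicitly between items. StrictFoo and ws are hypothetical names introduced here.

```scala
import scala.util.parsing.combinator.JavaTokenParsers

// Hypothetical variant of Foo that requires whitespace between items.
class StrictFoo extends JavaTokenParsers {
  // Stop literals from silently consuming leading whitespace.
  override def skipWhitespace = false

  // Hypothetical helper: one or more whitespace characters.
  val ws: Parser[String] = regex("""\s+""".r)

  // ws ~> p requires whitespace, then discards it and keeps p's result.
  lazy val check = id ~ (ws ~> action) ~ (ws ~> obj)

  val id     = "FOO" | "BAR"
  val action = "GET" | "SET"
  val obj    = "BAZ" | "BIZ"
}

object StrictFooDemo {
  def main(args: Array[String]): Unit = {
    val p = new StrictFoo
    println(p.parseAll(p.check, "FOO GET BAZ").successful) // true
    println(p.parseAll(p.check, "FOO GETBAZ").successful)  // false
  }
}
```

One consequence of this approach is that, with skipping disabled, leading or trailing whitespace in the input is rejected as well unless you add ws parsers for it.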

I'd probably go with the \b solution, but it's largely a matter of taste.
