Reputation: 33329
I wrote a Scala parser for an in-house expression language that has double quote-delimited string literals:
object MyParser extends JavaTokenParsers {
  lazy val strLiteral = "\"" ~> """[^"]*""".r <~ "\"" ^^ {
    case x ⇒ StringLiteral(x)
  }
  // ...
}
(The actual code is a bit different since I support "" as an escape sequence for a literal double quote. While this is not relevant for the discussion, it's the reason why I cannot just use JavaTokenParsers's stringLiteral.)
I noticed that the parser fails to include whitespace at the beginning and at the end of a string:
"a" parsed as StringLiteral("a")
" a" parsed as StringLiteral("a")
"a " parsed as StringLiteral("a")
" a " parsed as StringLiteral("a")
I tried matching whitespace in the regex:
"\"" ~> """\s*[^"]*\s*""".r <~ "\""
and also using the explicit whiteSpace parser:
"\"" ~> whiteSpace.? ~ """[^"]*""".r ~ whiteSpace.? <~ "\""
but in both cases the ~> operator has already consumed and ignored the spaces before there's a chance to read and handle them.
I know that I can set skipWhitespace = false, but I prefer not to, since in general I want to allow arbitrary whitespace around tokens in this language.
What's a simple and clean strategy to include surrounding whitespace in string literals?
Upvotes: 2
Views: 152
Reputation: 10882
One option you have is to use a single regexp for the whole string literal, quotes included:
val stringLiteral:Parser[String] = """"([^"]*("")?)*"""".r
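and then strip the matched quotes afterwards. Whitespace is skipped only before the regex starts matching, so spaces after the opening quote are part of the match and are kept.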
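A minimal sketch of how this could look end to end, assuming the scala-parser-combinators library is on the classpath. The StringLiteral case class and the parseLiteral helper are hypothetical stand-ins, since the question does not show its own definitions, and the parser is named strLiteral so it does not clash with the stringLiteral that JavaTokenParsers already defines:

```scala
import scala.util.parsing.combinator.JavaTokenParsers

// Hypothetical AST node; the question's real definition is not shown.
case class StringLiteral(value: String)

object StringLiteralSketch extends JavaTokenParsers {
  // The whole literal, delimiters included, is matched by one regex.
  // Whitespace is skipped only before the opening quote, so any
  // whitespace inside the quotes stays in the matched text.
  lazy val strLiteral: Parser[StringLiteral] =
    """"([^"]*("")?)*"""".r ^^ { raw =>
      // Drop the delimiting quotes.
      val body = raw.substring(1, raw.length - 1)
      // Assumption: "" inside a literal stands for one double quote.
      StringLiteral(body.replace("\"\"", "\""))
    }

  // Hypothetical helper for trying the parser on a single literal.
  def parseLiteral(input: String): ParseResult[StringLiteral] =
    parseAll(strLiteral, input)
}
```

With this, StringLiteralSketch.parseLiteral("\" a \"") should succeed with the inner spaces preserved, while skipWhitespace stays at its default for the rest of the grammar.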
Upvotes: 1