Reputation: 101
I want my regex to print everything before a { or {{ (not including them).
What I have so far is:
class ExpressionParser extends RegexParsers {
val regExpr = """^.*?((?=\{{2})|(?=\{)|$)""".r //not sure about the "$". Added it because test case 1 wasn't printing. see below
def program: Parser[Any] = regExpr
}
and here are my tests:
object Test {
def main(args: Array[String]): Unit = {
val p = new ExpressionParser()
val test = p.parseAll(p.program, "tests go here") // doesn't print anything
if(test.successful) println(test.get)
// replace 'tests go here' with each of these
//"This is plain text so should always print" // this isn't printing so make checks for { optional
//"abc {{"
//"abc de{ fg{{{ hi"
//"abc } {{ {{ de{' fg{{{ hi"
}
}
I want it to print:
//This is plain text so should always print
//abc
//abc de
//abc }
Only the first test prints. Why?
Thanks !
Upvotes: 0
Views: 2529
Reputation: 582
Scroll down to the edit for the answer I added after the poster made the question more specific.
I've never heard of an ExpressionParser built into the Scala API, but if you want to get everything up to a certain point or between two things you can use
(?s)(.*)
So to get everything before the letter 'a' you would use...
(?s)(.*)a
Code example:
val regex2 = """(?s)(.*)a""".r
val str1 = "somethinga"
str1 match {
case regex2(left) => println(left)
}
This will print "something" without quotes
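One caveat with the pattern-match form above: a Scala regex used as an extractor has to match the whole string, so an input that does not end in 'a' throws a MatchError, and the greedy (.*) runs up to the last 'a' rather than the first. Here is a minimal sketch that handles both cases, assuming you want the text before the first 'a'. The names BeforeMarker and beforeFirstA are made up for illustration.

```scala
object BeforeMarker {
  // Lazy capture (.*?) stops at the first 'a'; the trailing .* lets the
  // extractor match the whole string even when text follows that 'a'.
  private val BeforeFirstA = """(?s)(.*?)a.*""".r

  // Returns None instead of throwing a MatchError when there is no 'a'.
  def beforeFirstA(s: String): Option[String] = s match {
    case BeforeFirstA(left) => Some(left)
    case _                  => None
  }

  def main(args: Array[String]): Unit = {
    println(beforeFirstA("somethinga")) // Some(something)
    println(beforeFirstA("banana"))     // Some(b)
    println(beforeFirstA("xyz"))        // None
  }
}
```

Returning an Option keeps the "no match" case explicit at the call site, which is usually easier to handle than catching a MatchError.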
Edit: Since you have now updated your question to show that you are using RegexParsers, here is a solution using that, though it is over-the-top and unnecessary if this is all you are using RegexParsers for.
class ExpressionParser extends RegexParsers {
def remover: Parser[String] = """.*(?=\{)|.*""".r
}
In main:
val p = new ExpressionParser()
val test = p.parse(p.remover, "tests go here{") // parse rather than parseAll: the trailing { is deliberately left unconsumed
if (test.successful) println(test.get) // prints "tests go here"
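A note on parse versus parseAll, since it matters here: as far as I can tell from the library, parseAll only succeeds when the parser consumes the entire input, while parse accepts a match on a prefix. Because the lookahead never consumes the {, any input containing a { leaves text behind, which would explain why only the brace-free test in the question prints. A minimal sketch of the difference, assuming the scala-parser-combinators dependency is on the classpath. The object and method names are hypothetical.

```scala
import scala.util.parsing.combinator.RegexParsers

// Hypothetical wrapper; needs the scala-parser-combinators library.
object BeforeBrace extends RegexParsers {
  // Same pattern as remover above.
  def remover: Parser[String] = """.*(?=\{)|.*""".r

  // parse succeeds as soon as the parser matches a prefix of the input.
  def viaParse(s: String): Option[String] = parse(remover, s) match {
    case Success(result, _) => Some(result)
    case _                  => None
  }

  // parseAll also requires that nothing is left over after the match.
  def viaParseAll(s: String): Option[String] = parseAll(remover, s) match {
    case Success(result, _) => Some(result)
    case _                  => None
  }

  def main(args: Array[String]): Unit = {
    println(viaParse("tests go here{"))    // Some(tests go here)
    println(viaParseAll("tests go here{")) // None: the { is never consumed
    println(viaParseAll("tests go here"))  // Some(tests go here)
  }
}
```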
I figured this out by reading the RegexParsers documentation here: https://github.com/scala/scala-parser-combinators and https://github.com/scala/scala-parser-combinators/blob/1.1.x/docs/Getting_Started.md
As for an explanation, in case the documentation still doesn't make sense: this is solved using a "lookahead group". A lookahead (?=...) checks that its pattern comes next in the input but does not consume it, so whatever it matches is excluded from the result.
Therefore .*(?=\{) matches everything up to a { and returns that, without the { itself.
The reason for the | is that the left side only matches when there is a { somewhere in the input. If there isn't one, the left side fails, so we need an "or" (|) to say: if there isn't a {, just use everything.
The reason we can't just add a ? after the lookahead group to make it optional is that the { would no longer be excluded. The greedy .* already matches the whole input, { included, and the optional lookahead is then satisfied by matching nothing. You can try it out if you want with this regex.
.*(?=\{)?
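To see these patterns side by side, here is a small sketch that uses only scala.util.matching.Regex from the standard library. findPrefixOf matches at the start of the string, which is roughly how I understand RegexParsers applies a regex. Note that the greedy .* in my pattern stops before the last {, not the first, so the [^{]* variant below is an extra suggestion of mine (not something from the question) for inputs with more than one {. The object and value names are made up for illustration.

```scala
import scala.util.matching.Regex

object BraceRegexDemo {
  val answerPattern: Regex     = """.*(?=\{)|.*""".r // lookahead with fallback
  val optionalLookahead: Regex = """.*(?=\{)?""".r   // lookahead made optional
  val negatedClass: Regex      = """[^{]*""".r       // assumption: stop at the FIRST {

  // Text matched at the start of s, or "" if the pattern does not match there.
  def prefixOf(r: Regex, s: String): String = r.findPrefixOf(s).getOrElse("")

  def main(args: Array[String]): Unit = {
    println(prefixOf(answerPattern, "abc{"))             // abc
    println(prefixOf(optionalLookahead, "abc{"))         // abc{
    println(prefixOf(answerPattern, "abc de{ fg{{{ hi")) // abc de{ fg{{
    println(prefixOf(negatedClass, "abc de{ fg{{{ hi"))  // abc de
  }
}
```

The negated character class needs no lookahead or alternation at all: it simply stops consuming at the first {, and matches the whole input when there is none.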
Upvotes: 1