Reputation: 23633
Suppose I have a regex that has one capturing group. Is there an easy way in Scala to replace just the text matched by that capturing group with a replacement string? I've only been able to find functionality for replacing the entire regex match with some content. The replacement may refer to a capturing group, but the rest of the match is not carried over into the result. To give a concrete example:
val p = """^[bf]oo: '(.*)'"""r
println(p.replaceFirstGroup("foo: 'replace me'", "asdf")) // something like this
with output
foo: 'asdf'
Upvotes: 1
Views: 2963
Reputation: 33029
Using lookahead and lookbehind (as defined for java.util.regex.Pattern), along with String.replaceFirst, would give you the desired results:
val p = """(?<=^[bf]oo: ').*(?=')"""
println("foo: 'replace me'".replaceFirst(p, "asdf"))
// => foo: 'asdf'
The lookahead (?=) and lookbehind (?<=) both match text without including it as part of the match result. This is why replaceFirst only replaces the part not included in the lookahead or lookbehind, i.e. the .* between the single quote marks.
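As a rough sketch of the same lookaround idea using scala.util.matching.Regex rather than String.replaceFirst: the object name LookaroundSketch and the helper replaceQuoted are made up for illustration. The sketch also assumes you want the replacement treated as literal text, so it is passed through Regex.quoteReplacement; otherwise a $ or \ in the replacement would be interpreted as a group reference or escape.

```scala
import scala.util.matching.Regex

object LookaroundSketch {
  // The lookbehind asserts the "foo: '" / "boo: '" prefix and the lookahead
  // asserts the closing quote; only the text between them is the match.
  private val quoted: Regex = """(?<=^[bf]oo: ').*(?=')""".r

  // Hypothetical helper: replaces the quoted text in the first match,
  // treating `replacement` literally.
  def replaceQuoted(target: String, replacement: String): String =
    quoted.replaceFirstIn(target, Regex.quoteReplacement(replacement))

  def main(args: Array[String]): Unit = {
    println(replaceQuoted("foo: 'replace me'", "asdf")) // foo: 'asdf'
    println(replaceQuoted("foo: 'replace me'", "$5"))   // foo: '$5'
  }
}
```

If the input does not match, replaceFirstIn returns the target unchanged.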
Alternatively (and I'd probably prefer this solution), you can match all of the pieces, and reference the pieces that you want to leave unchanged in the replacement string using the $n group-reference syntax ($1 and $3 here; named groups use ${name}):
val p = """(^[bf]oo: ')(.*)(')"""
println("foo: 'replace me'".replaceFirst(p, "$1asdf$3"))
// => foo: 'asdf'
I know that's not technically replacing the first capture group, but lookahead and lookbehind always make me feel dirty. (I know, ironic right? We're already using regular expressions here!)
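The group-reference approach can also be written with named groups, which java.util.regex supports via (?<name>...) in the pattern and ${name} in the replacement. This is only a sketch: the group names head, body and tail, the object NamedGroupSketch, and the helper replaceBody are all hypothetical, and the replacement is assumed to be literal text.

```scala
import java.util.regex.Matcher

object NamedGroupSketch {
  // Three pieces: the prefix to keep, the body to replace, the closing quote to keep.
  private val p = """(?<head>^[bf]oo: ')(?<body>.*)(?<tail>')"""

  // "${head}" is a plain (non-interpolated) Scala string, so it reaches
  // the regex engine unchanged and is resolved there as a group reference.
  def replaceBody(target: String, replacement: String): String =
    target.replaceFirst(p, "${head}" + Matcher.quoteReplacement(replacement) + "${tail}")

  def main(args: Array[String]): Unit =
    println(replaceBody("foo: 'replace me'", "asdf")) // foo: 'asdf'
}
```

Named references make the replacement string easier to read when the pattern grows, at the cost of a slightly longer pattern.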
I was hoping to come up with something else since the lookahead limits the complexity of the regex that is in the lookahead portion and matching many groups adds extra complexity to the extractors and replacement code.
This is a bit more cumbersome to implement (you have to write some extra code), but it would keep your extractors uncluttered while also avoiding lookaheads/lookbehinds:
import scala.util.matching.Regex

implicit class MyRegExOps(val pattern: Regex) extends AnyVal {
  // Replaces the text captured by group 1 in the first match.
  // Returns None when the pattern does not match at all.
  def replaceFirstGroup(target: String, replacement: String): Option[String] =
    for (matched <- pattern.findFirstMatchIn(target))
      // start(1) and end(1) are offsets into target, so slice target itself
      yield target.substring(0, matched.start(1)) +
        replacement +
        target.substring(matched.end(1))
}
// Notice that the next two lines exactly match your original post
val p = """^[bf]oo: '(.*)'"""r
println(p.replaceFirstGroup("foo: 'replace me'", "asdf"))
// => Some(foo: 'asdf')
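If every match should have its first group replaced rather than only the first match, one possible variation is to use Regex.replaceAllIn with a function. This is a sketch under two assumptions: group 1 participates in every match (otherwise start(1) is -1 and substring throws), and the replacement is literal text. The names GroupReplaceAllSketch and replaceGroup1InAll are invented for this example.

```scala
import scala.util.matching.Regex

object GroupReplaceAllSketch {
  // For each match, rebuild the matched text with group 1 swapped out.
  def replaceGroup1InAll(pattern: Regex, target: String, replacement: String): String =
    pattern.replaceAllIn(target, m => {
      // Offsets reported by Match are relative to the whole target string.
      val before = target.substring(m.start, m.start(1))
      val after  = target.substring(m.end(1), m.end)
      // Quote so that $ and \ in the rebuilt text are taken literally.
      Regex.quoteReplacement(before + replacement + after)
    })

  def main(args: Array[String]): Unit = {
    val p = """[bf]oo: '([^']*)'""".r
    println(replaceGroup1InAll(p, "foo: 'a', boo: 'b'", "x")) // foo: 'x', boo: 'x'
  }
}
```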
Upvotes: 3
Reputation: 12573
Perhaps the replaceSomeIn method might be helpful here? Quoting the example from its ScalaDoc:
import scala.util.matching.Regex._
val map = Map("x" -> "a var", "y" -> """some $ and \ signs""")
val text = "A text with variables %x, %y and %z."
val varPattern = """%(\w+)""".r
val mapper = (m: Match) => map get (m group 1) map (quoteReplacement(_))
val repl = varPattern replaceSomeIn (text, mapper)
In your case:
val p = """^([bf]oo): '(.*)'"""r
val map = Map("foo" -> "foo: 'asdf'")
val lines = List("boo: 'bar' and beyond","foo: 'yuck' whatever")
val mapper = (m: Match) => map get (m group 1) map (quoteReplacement(_))
scala> val repl = lines map { line => p replaceSomeIn (line, mapper) }
m: boo: 'bar' boo
m: foo: 'yuck' foo
repl: List[String] = List(boo: 'bar' and beyond, foo: 'asdf' whatever)
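A variation on the above that stores only the new quoted value in the map and rebuilds the rest of the match from group 1 could look like the following. It is a sketch, not a drop-in answer: the object ReplaceSomeSketch, the replacements table and the rewrite helper are hypothetical names. When the mapper returns None, replaceSomeIn leaves that match untouched.

```scala
import scala.util.matching.Regex
import scala.util.matching.Regex.Match

object ReplaceSomeSketch {
  private val p = """^([bf]oo): '(.*)'""".r

  // Hypothetical lookup table: key (group 1) -> new quoted value (replaces group 2).
  private val replacements = Map("foo" -> "asdf")

  def rewrite(line: String): String =
    p.replaceSomeIn(line, (m: Match) =>
      replacements.get(m.group(1)).map { v =>
        // Rebuild the whole match, keeping the key and swapping the quoted value.
        Regex.quoteReplacement(s"${m.group(1)}: '$v'")
      })

  def main(args: Array[String]): Unit = {
    println(rewrite("foo: 'yuck' whatever"))  // foo: 'asdf' whatever
    println(rewrite("boo: 'bar' and beyond")) // boo: 'bar' and beyond
  }
}
```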
Upvotes: 0