Derek Wyatt

Reputation: 2727

Scala Regex replaceAllIn fails when the replacement string looks like a regex?

I've been happily running a Regex replaceAllIn for quite a while but ran into a problem when the replacement string had something that looked like a regex in it. The following illustrates the problem (Scala 2.9.1-1). Note that the real problem space is much more complex, so the idea of using a simpler solution isn't really tenable (just to preempt the inevitable "Why don't you try ..." :D)

val data = "val re = \"\"\"^[^/]*://[^/]*/[^/]*$\"\"\".r"
val source = """here
LATEX_THING{abc}
there"""
val re = "LATEX_THING\\{abc\\}".r
println(re.replaceAllIn(source, data))

This presents with the following error:

java.lang.IllegalArgumentException: Illegal group reference

If I change data from what it was to something simple like:

val data = "This will work"

Then everything's fine.

It looks like replaceAllIn is somehow looking in the second string and using it as another RE to reference what was remembered from the first RE... but the docs say nothing about this.

What am I missing?

edit: Ok, so after looking at the java.util.regex.Matcher class, it would seem that the intended fix is:

re.replaceAllIn(source, java.util.regex.Matcher.quoteReplacement(data))
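For anyone who wants to try it, here is a minimal self-contained sketch of the failing call next to the quoted one. The object name `QuoteReplacementDemo` and the `unquoted`/`quoted` value names are only for illustration, and `scala.util.Try` assumes Scala 2.10 or later:

```scala
import java.util.regex.Matcher
import scala.util.Try

object QuoteReplacementDemo {
  // Replacement text from the question; note the bare `$` just before the closing quotes.
  val data = "val re = \"\"\"^[^/]*://[^/]*/[^/]*$\"\"\".r"
  val source = "here\nLATEX_THING{abc}\nthere"
  val re = "LATEX_THING\\{abc\\}".r

  // Unquoted: the `$` is parsed as the start of a group reference, so the call throws.
  val unquoted = Try(re.replaceAllIn(source, data))

  // Quoted: `$` and `\` are escaped first, so the text is inserted verbatim.
  val quoted = re.replaceAllIn(source, Matcher.quoteReplacement(data))

  def main(args: Array[String]): Unit = {
    println(unquoted.isFailure)
    println(quoted)
  }
}
```

The quoted call should leave `data` untouched between the `here` and `there` lines.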

Upvotes: 8

Views: 1921

Answers (1)

Travis Brown

Reputation: 139058

You need to escape the $ in your replacement string:

val data = "val re = \"\"\"^[^/]*://[^/]*/[^/]*\\$\"\"\".r"

Otherwise it's interpreted as the beginning of a group reference (which would only be valid if the $ were followed by one or more digits or, on Java 7 and later, by a group name in braces). See the documentation for java.util.regex.Matcher for more detail:

The replacement string may contain references to subsequences captured during the previous match: Each occurrence of $g will be replaced by the result of evaluating group(g)... A dollar sign ($) may be included as a literal in the replacement string by preceding it with a backslash (\$).
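To make the two cases in that passage concrete, here is a small sketch. The pattern, the input string and the names (`GroupReferenceDemo`, `withGroup`, `withLiteral`) are made up for illustration and are not from the question:

```scala
object GroupReferenceDemo {
  // One capturing group: the part before the @.
  val re = "(\\w+)@example\\.com".r
  val input = "mail alice@example.com"

  // `$1` in the replacement is substituted with the text captured by group 1.
  val withGroup = re.replaceAllIn(input, "user=$1")

  // `\\$` in Scala source is the two characters `\$` at runtime,
  // which the matcher treats as a literal dollar sign.
  val withLiteral = re.replaceAllIn(input, "cost=\\$5")

  def main(args: Array[String]): Unit = {
    println(withGroup)   // mail user=alice
    println(withLiteral) // mail cost=$5
  }
}
```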

Update to address your comment and edit above: Yes, you can use Matcher.quoteReplacement if you're not working with string literals (or if you are, I guess, but escaping the $ seems easier in that case), and there's at least a chance that quoteReplacement will be available as a method on scala.util.matching.Regex in the future.
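As far as I know, Scala 2.10 and later do expose this as `Regex.quoteReplacement` on the companion object, delegating to the Java method. A minimal sketch, assuming such a Scala version (the object name, pattern and replacement text are illustrative only):

```scala
import scala.util.matching.Regex

object ScalaQuoteDemo {
  val re = "PRICE".r

  // Illustrative replacement text that arrives at runtime and contains a `$`.
  val replacement = "$9.99"

  // Regex.quoteReplacement escapes `$` and `\` so the text is used literally.
  val result = re.replaceAllIn("cost: PRICE", Regex.quoteReplacement(replacement))

  def main(args: Array[String]): Unit =
    println(result) // cost: $9.99
}
```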

Upvotes: 10
