Reputation: 8474
I need to write a regex that replaces a with b, but only inside a <pre> tag.
Example
a <pre> c a <foo> a d </pre> a
Result
a <pre> c b <foo> b d </pre> a
Please help me write an expression for Java's String.replace function. The <pre> tags are guaranteed not to be nested.
Upvotes: 1
Views: 1612
Reputation: 13906
I think the best you can do with String.replace() is something like:
String string = ...
for (;;)
{
    String original = string;
    string = string.replaceFirst("(<pre>.*?)a(.*?</pre>)", "$1b$2");
    if (original.equals(string))
        break;
}
(EDIT: @Bohemian has noted that the above regex doesn't work correctly, because it can match across a </pre> boundary. So it needs to be changed to:
(<pre>(?:(?!</pre>).)*)a((?:(?!<pre>).)*</pre>)
(untested) to avoid matching outside a <pre>...</pre> section. With this change, we don't need the *? quantifier and can use the more common "greedy" (*) quantifier. This is starting to look a lot like my other answer, which I only really meant as a joke!)
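For reference, here is a minimal runnable sketch of the loop using the changed expression with its parentheses balanced. The class and method names are made up for illustration, and it assumes the <pre> blocks are well-formed, not nested, and contain no line breaks (the "." does not match them by default):

```java
class PreLoopReplace {
    // Hypothetical helper: rewrites one "a" per pass until nothing changes.
    static String replaceInsidePre(String input) {
        String current = input;
        while (true) {
            // Group 1: "<pre>" plus text that never crosses "</pre>".
            // Group 2: text that never crosses another "<pre>", then "</pre>".
            String next = current.replaceFirst(
                "(<pre>(?:(?!</pre>).)*)a((?:(?!<pre>).)*</pre>)", "$1b$2");
            if (next.equals(current)) {
                return next; // no "a" left inside any <pre> block
            }
            current = next;
        }
    }

    public static void main(String[] args) {
        // prints: a <pre> c b <foo> b d </pre> a
        System.out.println(replaceInsidePre("a <pre> c a <foo> a d </pre> a"));
    }
}
```

Each pass rescans the string from the start, so this is quadratic in the worst case, and it would loop forever if the replacement text itself contained the character being replaced.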
You're better off using a Matcher (following code off the top of my head):
import java.util.regex.Pattern;
import java.util.regex.Matcher;

// Capture only the text between <pre> and </pre>; the tags stay outside the match
Pattern pattern = Pattern.compile("(?<=<pre>)(.*?)(?=</pre>)");
Matcher matcher = pattern.matcher(string);
StringBuffer replacement = new StringBuffer();
while (matcher.find())
{
    // Copy the text before this match unchanged and drop the matched text itself
    matcher.appendReplacement(replacement, "");
    // Careful using unknown text in appendReplacement as any "$n" will cause problems
    replacement.append(matcher.group(1).replace("a", "b"));
}
// Copy whatever follows the last match
matcher.appendTail(replacement);
String result = replacement.toString();
Edit: Changed pattern above so that it does not match surrounding <pre> and </pre>.
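On Java 9 or later, the same idea can be written more compactly with Matcher.replaceAll(Function). The following is only a sketch under that assumption: the class name, method name, and parameters are hypothetical, and Pattern.DOTALL is added on the assumption that <pre> blocks may span several lines.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PreBlockReplace {
    // DOTALL lets "." match line breaks inside a <pre> block.
    private static final Pattern PRE_BODY =
        Pattern.compile("(?<=<pre>)(.*?)(?=</pre>)", Pattern.DOTALL);

    // Hypothetical helper: replaces target with replacement inside <pre> bodies only.
    static String replaceInsidePre(String input, String target, String replacement) {
        return PRE_BODY.matcher(input).replaceAll(
            // quoteReplacement keeps "$" and "\" in the body from being read as group references
            m -> Matcher.quoteReplacement(m.group(1).replace(target, replacement)));
    }

    public static void main(String[] args) {
        // prints: a <pre> c b <foo> b d </pre> a
        System.out.println(replaceInsidePre("a <pre> c a <foo> a d </pre> a", "a", "b"));
    }
}
```

Matcher.quoteReplacement addresses the "$n" concern mentioned in the comment above, since the function's return value is still interpreted as a replacement string.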
Upvotes: 3
Reputation: 13906
Here's a regex that will do the job (I think; I wouldn't bet too much on it passing all tests):
String replacement = original.replaceAll(
"(?<=<pre>(?:(?!</pre>).){0,50})a(?=(?:(?!<pre>).)*</pre>)",
"b");
Explanation:
(?<=<pre>(?:(?!</pre>).){0,50}) - Look behind for a preceding <pre>, so long as we don't traverse back over </pre> to find it. Java requires a finite maximum length look-behind, so we use {0,50} rather than *.
a - The character we want to replace.
(?=(?:(?!<pre>).)*</pre>) - Look ahead for </pre>, so long as we don't traverse past <pre> to find it.
Upvotes: 0
Reputation: 12843
Try this:
Pattern pattern = Pattern.compile("<pre>(.+?)</pre>");
java.util.regex.Matcher matcher = pattern.matcher("a <pre> c a <tag> a d </pre> a");
Upvotes: -1