Reputation: 24894
Code:
public class Main {
    public static void main(String[] args) {
        String mainTag = "HI";
        String replaceTag = "667";
        String text = "92<HI=/><z==//HIb><cHIhi> ";
        System.out.println(strFormatted(mainTag, replaceTag, text));

        mainTag = "aBc";
        replaceTag = "923";
        text = "<dont replacethis>abcabc< abcabcde >";
        System.out.println(strFormatted(mainTag, replaceTag, text));
    }

    private static String strFormatted(String mainTag, String replaceTag, String text) {
        return text.replaceAll("(?i)(?<=<)" + mainTag + "(?=.*>)", replaceTag);
    }
}
So, I want to replace mainTag (variable) with replaceTag (variable), but only inside tags (<...>).
In the example above, I want to replace every occurrence of the mainTag HI (case-insensitive) inside <...> with 667, but my code only replaces the first occurrence.
Examples:
92<HI=/><z==//HIb><cHIhi>
Expected output:
92<667=/><z==//667b><c667667>
(mainTag = "HI", replaceTag = "667")
<dont replacethis>abcabc<abcabcde>
Expected output:
<dont replacethis>abcabc<923923de>
(mainTag = "aBc", replaceTag = "923");
Note: My code is wrong not only because it replaces just one occurrence, but also because it only works when the mainTag immediately follows the "<". In other words, the lookbehind only covers that one situation.
Upvotes: 1
Views: 799
Reputation: 213351
You just need a look-ahead here. The idea is to find every mainTag that is followed by a >, and then only by matching pairs of <>, and replace it with replaceTag. The following regex would work:
text.replaceAll("(?i)" + mainTag + "(?=[^<>]*>(?:[^<>]*<[^<>]*>)*[^<>]*$)", replaceTag);
Explanation:
(?i) # Ignore Case
mainTag # Match mainTag
(?= # which is followed by
[^<>]* # Some 0 or more characters which are not < or >
> # Close the bracket (this ensures mainTag sits before a closing bracket)
(?: # Start a group (to match pair of bracket)
[^<>]* # non-bracket characters
< # Start a bracket
[^<>]* # non-bracket characters
> # End the bracket
)* # Match the pair 0 or more times.
[^<>]* # Non-bracket characters 0 or more times.
$ # End of the input
) # End of the look-ahead
The above regex assumes that brackets are always balanced. For unbalanced brackets, it might give unexpected results. But then regex is not really the right tool for that job.
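To make the balanced-bracket assumption concrete, here is a minimal sketch that applies the look-ahead with the `$` anchor inside it. The class and method names (`BalancedTagReplace`, `replaceBalanced`) are hypothetical. The calls to `Pattern.quote` and `Matcher.quoteReplacement` are additions that are not in the snippet above; they assume mainTag and replaceTag should be treated as literal text rather than as regex syntax.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BalancedTagReplace {
    // Hypothetical helper: replaces mainTag (case-insensitively) only where it is
    // followed by '>' and then by balanced <...> pairs up to the end of the input.
    static String replaceBalanced(String mainTag, String replaceTag, String text) {
        String regex = "(?i)" + Pattern.quote(mainTag)      // assumption: mainTag is literal text
                + "(?=[^<>]*>(?:[^<>]*<[^<>]*>)*[^<>]*$)";  // '$' sits inside the look-ahead
        return text.replaceAll(regex, Matcher.quoteReplacement(replaceTag));
    }

    public static void main(String[] args) {
        // Balanced input: every HI inside <...> is replaced.
        System.out.println(replaceBalanced("HI", "667", "92<HI=/><z==//HIb><cHIhi> "));
        // prints: 92<667=/><z==//667b><c667667> (followed by the trailing space)

        // Unbalanced input: the unclosed '<' at the end makes the look-ahead fail,
        // so nothing is replaced, even inside the well-formed first tag.
        System.out.println(replaceBalanced("HI", "667", "<HI> <oops"));
        // prints: <HI> <oops
    }
}
```

The second call shows the kind of unexpected result mentioned above: a single stray `<` later in the input suppresses replacements in tags that are themselves well formed.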
Otherwise, a regex as simple as this would also work fine:
"(?i)" + mainTag + "(?=[^<>]*>)"
Which one to use depends on your use case. This one doesn't check for balanced brackets. Try the second one first; if it fits all your scenarios, it's the best choice.
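As a rough sketch of the simpler look-ahead, the following runs it against both examples from the question. The names `SimpleTagReplace` and `replaceInsideTags` are hypothetical. `Pattern.quote` and `Matcher.quoteReplacement` are additions, assuming the tag and its replacement are plain text that may contain regex metacharacters.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class SimpleTagReplace {
    // Hypothetical helper: replaces mainTag (case-insensitively) wherever the next
    // bracket character after it is '>', without checking bracket balance.
    static String replaceInsideTags(String mainTag, String replaceTag, String text) {
        String regex = "(?i)" + Pattern.quote(mainTag) + "(?=[^<>]*>)";
        return text.replaceAll(regex, Matcher.quoteReplacement(replaceTag));
    }

    public static void main(String[] args) {
        System.out.println(replaceInsideTags("HI", "667", "92<HI=/><z==//HIb><cHIhi> "));
        // prints: 92<667=/><z==//667b><c667667> (followed by the trailing space)

        System.out.println(replaceInsideTags("aBc", "923", "<dont replacethis>abcabc<abcabcde>"));
        // prints: <dont replacethis>abcabc<923923de>

        // Limitation: a stray '>' outside any tag still satisfies the look-ahead.
        System.out.println(replaceInsideTags("HI", "667", "HI > there"));
        // prints: 667 > there
    }
}
```

The last call illustrates the trade-off: because this version only looks for the next `>`, text outside a tag is replaced too when a lone `>` follows it.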
Upvotes: 3