Reputation:
I have strings like this <p0=v0 p1=v1 p2=v2 ....> and I want to swap pX with vX to have something like <v0=p0 v1=p1 v2=p2 ....> using regexps.
I want only pairs in <> to be swapped.
I wrote:
Pattern pattern = Pattern.compile("<(\\w*)=(\\w*)>");
Matcher matcher = pattern.matcher("<p1=v1>");
System.out.println(matcher.replaceAll("$2=$1"));
But it works only with a single pair pX=vX.
Could someone explain how to write a regexp that works for multiple pairs?
Upvotes: 3
Views: 156
Reputation: 89565
You can use this pattern:
"((?:<|\\G(?<!\\A))\\s*)(p[0-9]+)(\\s*=\\s*)(v[0-9]+)"
To ensure that the pairs come after an opening angle bracket, the pattern starts with:
(?:<|\\G(?<!\\A))
which means: an opening angle bracket OR the end of the last match.
\\G is an anchor for the position immediately after the last match, or the beginning of the string if there has been no match yet (in other words, it is the last position of the regex engine in the string, which is zero at the start of the string). To avoid a match at the start of the string, I added the negative lookbehind (?<!\\A), meaning "not preceded by the start of the string".
This trick forces each pair to be preceded by another pair or by a <.
Example:
String subject = "p5=v5 <p0=v0 p1=v1 p2=v2 p3=v3> p4=v4";
String pattern = "((?:<|\\G(?<!\\A))\\s*)(p[0-9]+)(\\s*=\\s*)(v[0-9]+)";
String result = subject.replaceAll(pattern, "$1$4$3$2");
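To see what the lookbehind contributes, here is a minimal sketch that runs the pattern above next to a variant without (?<!\\A). The class name, constant names and the swap helper are made up for this illustration; only the first pattern and the subject string come from the answer.

```java
class GAnchorDemo {
    // Pattern quoted from the answer: a pair must follow '<' or a previous match.
    static final String WITH_GUARD =
        "((?:<|\\G(?<!\\A))\\s*)(p[0-9]+)(\\s*=\\s*)(v[0-9]+)";

    // Same pattern without the lookbehind, for comparison only.
    static final String WITHOUT_GUARD =
        "((?:<|\\G)\\s*)(p[0-9]+)(\\s*=\\s*)(v[0-9]+)";

    // Swaps groups 2 and 4 and keeps the prefix (group 1) and the '=' part (group 3).
    static String swap(String subject, String pattern) {
        return subject.replaceAll(pattern, "$1$4$3$2");
    }

    public static void main(String[] args) {
        String subject = "p5=v5 <p0=v0 p1=v1 p2=v2 p3=v3> p4=v4";
        System.out.println(swap(subject, WITH_GUARD));
        System.out.println(swap(subject, WITHOUT_GUARD));
    }
}
```

With the lookbehind, the leading p5=v5 should stay untouched. Without it, \\G also matches at offset 0 on the first search, so that leading pair gets swapped as well.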
If you need p and v to have the same number you can change it to:
String pattern = "((?:<|\\G(?<!\\A))\\s*)(p([0-9]+))(\\s*=\\s*)(v\\3)";
String result = subject.replaceAll(pattern, "$1$5$4$2");
If parts between angle brackets can contain other things (that are not pairs):
String pattern = "((?:<|\\G(?<!\\A))(?:[^\\s>]+\\s*)*?\\s*)(p([0-9]+))(\\s*=\\s*)(v\\3)";
String result = subject.replaceAll(pattern, "$1$5$4$2");
Note: all these patterns only check for an opening angle bracket; they don't check for a closing one. If a closing angle bracket is missing, pairs are replaced until there are no more contiguous pairs (first two patterns), or until the next closing angle bracket or the end of the string (third pattern).
You can check for a closing angle bracket by adding (?=[^<>]*>) at the end of each pattern. However, this makes the pattern slow, because the lookahead rescans the rest of the bracketed part for every pair. It is better to find the parts between angle brackets with (?<=<)[^<>]++(?=>) and to perform the replacement of pairs in a callback function. You can take a look at this post to implement it.
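One way the callback idea could look in Java is sketched below. It assumes Java 9 or later, where Matcher.replaceAll accepts a Function<MatchResult, String>. The class and method names are hypothetical; the outer pattern is the one quoted above and the inner pattern is the pair part of the first pattern.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BracketSwap {
    // Text strictly between '<' and '>'; the possessive ++ avoids backtracking.
    private static final Pattern INSIDE = Pattern.compile("(?<=<)[^<>]++(?=>)");

    // One pX=vX pair, with optional whitespace around '='.
    private static final Pattern PAIR =
        Pattern.compile("(p[0-9]+)(\\s*=\\s*)(v[0-9]+)");

    static String swapInsideBrackets(String subject) {
        // The lambda plays the role of the callback: it receives each bracketed
        // part and returns the text that should replace it.
        return INSIDE.matcher(subject).replaceAll(mr -> {
            String swapped = PAIR.matcher(mr.group()).replaceAll("$3$2$1");
            // The returned string is parsed as a replacement string,
            // so '$' and '\' have to be quoted.
            return Matcher.quoteReplacement(swapped);
        });
    }

    public static void main(String[] args) {
        System.out.println(
            swapInsideBrackets("p5=v5 <p0=v0 p1=v1 p2=v2 p3=v3> p4=v4"));
    }
}
```

Because the outer pattern requires both brackets, an unclosed part such as "<p0=v0 p1=v1" is left unchanged.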
Upvotes: 0
Reputation:
If Java supports the \G anchor (java.util.regex does), this will work for unnested <>'s
Find: ((?:(?!\A|<)\G|<)[^<>]*?)(\w+)=(\w+)(?=[^<>]*?>)
Replace (globally): $1$3=$2
Regex explained
( # (1 start)
(?:
(?! \A | < )
\G # Start at last match
|
< # Or, <
)
[^<>]*?
) # (1 end)
( \w+ ) # (2)
=
( \w+ ) # (3)
(?= [^<>]*? > ) # There must be a closing > ahead
Perl test case
$/ = undef;
$str = <DATA>;
$str =~ s/((?:(?!\A|<)\G|<)[^<>]*?)(\w+)=(\w+)(?=[^<>]*?>)/$1$3=$2/g;
print $str;
__DATA__
<p0=v0 p1=v1 p2=v2 ....>
Output >>
<v0=p0 v1=p1 v2=p2 ....>
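The same find/replace pair can be tried from Java, whose java.util.regex does provide \G. The following is only a sketch: the class and method names are invented here, and the regex is the one above with backslashes doubled for a Java string literal.

```java
class UnnestedSwap {
    // Same regex as the Perl test case, escaped for a Java string literal.
    static final String FIND =
        "((?:(?!\\A|<)\\G|<)[^<>]*?)(\\w+)=(\\w+)(?=[^<>]*?>)";

    static String swap(String s) {
        // $1 keeps the '<' (or the gap since the last match); $3=$2 swaps the pair.
        return s.replaceAll(FIND, "$1$3=$2");
    }

    public static void main(String[] args) {
        System.out.println(swap("<p0=v0 p1=v1 p2=v2 ....>"));
        System.out.println(swap("a=b <p0=v0 p1=v1> c=d"));
    }
}
```

Pairs outside the brackets (a=b and c=d in the second call) should be left alone, since each match must start at a < or directly continue from the previous match.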
Upvotes: 0
Reputation: 785376
This should work to swap only those pairs between < and >:
String string = "<p0=v0 p1=v1 p2=v2> a=b c=d xyz=abc <foo=bar baz=bat>";
Pattern pattern1 = Pattern.compile("<[^>]+>");
Pattern pattern2 = Pattern.compile("(\\w+)=(\\w+)");
Matcher matcher1 = pattern1.matcher(string);
StringBuffer sbuf = new StringBuffer();
while (matcher1.find()) {
Matcher matcher2 = pattern2.matcher(matcher1.group());
// quoteReplacement keeps any '$' or '\' in the swapped text literal
matcher1.appendReplacement(sbuf, Matcher.quoteReplacement(matcher2.replaceAll("$2=$1")));
}
matcher1.appendTail(sbuf);
System.out.println(sbuf);
<v0=p0 v1=p1 v2=p2> a=b c=d xyz=abc <bar=foo bat=baz>
Upvotes: 0
Reputation: 9160
Replacing everything between < and > (let's call it a tag) with a single regex is, IMHO, not possible if the same pattern can occur outside the tag.
Instead of replacing everything at once, I'd go for two regexes:
String str = "<p1=v1 p2=v2> p3=v3 <p4=v4>";
Pattern insideTag = Pattern.compile("<(.+?)>");
Matcher m = insideTag.matcher(str);
StringBuffer sb = new StringBuffer();
while (m.find()) {
    // Swap the pairs inside this tag only, so identical text outside a tag is not touched.
    String swapped = m.group(1).replaceAll("(\\w*)=(\\w*)", "$2=$1");
    m.appendReplacement(sb, Matcher.quoteReplacement("<" + swapped + ">"));
}
m.appendTail(sb);
System.out.println(sb);
//prints: <v1=p1 v2=p2> p3=v3 <v4=p4>
The matcher grabs everything between < and >. For each match, the content of the first capturing group is rewritten with every (\w*)=(\w*) pair swapped, and only that tag is replaced; text outside the tags is copied unchanged.
Trying it with
<p1=v1 p2=v2 just some trash> p3=v3 <p4=v4>
gives the output
<v1=p1 v2=p2 just some trash> p3=v3 <v4=p4>
Upvotes: 0
Reputation: 48404
Simple, use groups:
String input = "<p0=v0 p1=v1 p2=v2>";
// (p[0-9])  group 1: "p" followed by one digit
// =         a literal "="
// (v[0-9])  group 2: "v" followed by one digit
// "$2=$1"   writes group 2, then "=", then group 1
// Note: this matches pairs anywhere in the input, not only inside <>.
System.out.println(input.replaceAll("(p[0-9])=(v[0-9])", "$2=$1"));
Output:
<v0=p0 v1=p1 v2=p2>
Upvotes: 2