Reputation: 3285
I have a table like:
A | 1
A | 2
B | 1
B | 2
B | 3
I'm trying to transform it to look like this:
A { 1 | 2 }
B { 1 | 2 | 3 }
I've come up with this, which matches correctly, but I can't figure out how to get the repeated capture out:
(A|B)|(\d)(\r\n\1|(\d))*
UPDATE
I realize that this would be fairly trivial in a programming language; I was hoping to learn something more about regular expressions.
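To illustrate why the repeated capture is hard to get out, here is a minimal sketch. It assumes Java's java.util.regex, which the question does not specify, and the class and method names are hypothetical. In that engine, a capturing group inside a repetition holds only its last iteration once the match finishes, so the earlier values cannot be referenced from a replacement string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RepeatedCapture {
    // Returns what group 1 holds after the quantified group has iterated
    // over the whole input, or null if the input does not match.
    static String lastCapture(String input) {
        Matcher m = Pattern.compile("(?:(\\d),?)+").matcher(input);
        return m.matches() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // The group matched 1, then 2, then 3; only the last one is kept.
        System.out.println(lastCapture("1,2,3")); // prints 3
    }
}
```

Other engines may behave differently, so this is only meant to show the general shape of the problem.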
Upvotes: 2
Views: 376
Reputation: 383746
Here is some Java code that may be helpful:
String text = "A | 1\n" +
              "A | 2\n" +
              "B | 1\n" +
              "B | 2\n" +
              "B | 3\n" +
              "A | x\n" +
              "D | y\n" +
              "D | z\n";
// Split at each newline that is not followed by the key of the line before it.
String[] sections = text.split("(?<=(.) . .)\n(?!\\1)");
StringBuilder sb = new StringBuilder();
for (String section : sections) {
    // Keep the key once, then drop the newline and repeated key of later lines.
    sb.append(section.substring(0, 1) + " {")
      .append(section.substring(3).replaceAll("\n.", ""))
      .append(" }\n");
}
System.out.println(sb.toString());
This prints:
A { 1 | 2 }
B { 1 | 2 | 3 }
A { x }
D { y | z }
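The idea is to do this in two steps: first split the text into sections of consecutive lines that share the same key, then transform each section into the output format.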
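The two steps can also be looked at separately. The following is a sketch only: it reuses the same split and replaceAll expressions as the code above, but the class and method names are hypothetical and the input is shortened for illustration.

```java
import java.util.Arrays;
import java.util.List;

class SplitSections {
    // Step 1: split at each newline that is not followed by the key
    // of the line just before it.
    static List<String> sections(String text) {
        return Arrays.asList(text.split("(?<=(.) . .)\n(?!\\1)"));
    }

    // Step 2: turn one section such as "A | 1\nA | 2" into "A { 1 | 2 }".
    static String transform(String section) {
        return section.substring(0, 1) + " {"
                + section.substring(3).replaceAll("\n.", "")
                + " }";
    }

    public static void main(String[] args) {
        for (String s : sections("A | 1\nA | 2\nB | 1\n")) {
            System.out.println(transform(s));
        }
    }
}
```

Like the original, this assumes single-character keys and values separated by " | ", because the lookbehind has a fixed width.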
replaceAll variant
If you intersperse { and } in the input so that they can be captured and rearranged in the output, this is possible with a single replaceAll (i.e. an entirely regex solution):
String text = "{ A | 1 }" +
              "{ A | 2 }" +
              "{ B | 1 }" +
              "{ B | 2 }" +
              "{ B | 3 }" +
              "{ C | 4 }" +
              "{ D | 5 }";
System.out.println(
    text.replaceAll("(?=\\{ (.))(?<!(?=\\1).{7})(\\{)( )(.) .|(?=\\}. (.))(?:(?<=(?=\\5).{6}).{5}|(?<=(.))(.))", "$4$3$2$7$6")
);
This prints (see output on ideone.org):
A { 1 | 2 } B { 1 | 2 | 3 } C { 4 } D { 5 }
Unfortunately, no, I don't think this is worth explaining; it's far too complicated for what it accomplishes. Essentially, though, it relies on lots of assertions, nested assertions, and capture groups (some of which will be empty strings, depending on which assertion passes).
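The point about some capture groups ending up empty can be shown on a much smaller pattern. This is a simplified sketch, not a breakdown of the regex above; the class and method names are hypothetical. In Java's replaceAll, a group that did not take part in the match contributes nothing to the replacement, so one replacement string can serve several alternatives.

```java
class EmptyGroups {
    // Each match goes through exactly one alternative, so only one of
    // $1 and $2 is non-empty in any given replacement.
    static String tag(String input) {
        return input.replaceAll("(a)|(b)", "[$1$2]");
    }

    public static void main(String[] args) {
        System.out.println(tag("ab")); // prints [a][b]
    }
}
```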
This is, without a doubt, the most complicated regex I've written.
Upvotes: 1