Reputation: 739
I need to split a string at every i-th and j-th character alternately, where i and j are input parameters. For example, if I have the input
String s = "1234567890abcdef";
int i = 2;
int j = 3;
I want my output to be an array of:
[12, 345, 67, 890, ab, cde, f]
I found a compact regex to split at every n-th char. Example for n = 3 using "(?<=\\G...)" or "(?<=\\G.{3})":
String s = "1234567890abcdef";
int n = 3;
System.out.println(Arrays.toString(s.split("(?<=\\G.{"+n+"})")));
//output: [123, 456, 789, 0ab, cde, f]
How can I modify the above regex to split at every 2nd and 3rd char alternately? A naive chaining like "(?<=\\G.{2})(?<=\\G.{3})" did not work.
Upvotes: 4
Views: 124
Reputation: 468
There is a somewhat hacky way to do this with split() and a regex, but as @horcrux mentioned:
every match should be aware of the pattern previously matched
You would have to:
a) Insert an anchor for the later lookahead by first adding an "unlikely" character or string (e.g. a line break) after every i + j characters:
s = s.replaceAll("(.{5})", "$1\n");
Your string then becomes 12345\n67890\nabcde\nf
b) Now you can split using lookarounds:
String[] result = s.split("(?<=\\G.{2})(?=.{3}\n)|\n");
This looks for a zero-length match that has i characters on its left, (?<=\G.{2}), and is followed by j characters ending with your "special" pattern, OR, if that is not found, just matches the "special" pattern itself.
The split therefore alternates between the position i characters into each group and the "special" pattern at the end of the group.
Complete one-liner (for educational purposes only):
System.out.println(Arrays.toString(s.replaceAll("(.{"+(i+j)+"})", "$1#").split("(?<=\\G.{"+i+"})(?=.{"+j+"}#)|#")));
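For reference, the two steps above can be wrapped in a small method. This is only a sketch: the class name MarkerSplit and the method name splitAlternating are made up for illustration, and it assumes the marker character '\n' never occurs in the input. It also shows a limitation of the trick: a trailing group shorter than i + j characters gets no marker, so the lookahead never matches inside it and that tail is returned unsplit.

```java
import java.util.Arrays;

public class MarkerSplit {
    // Hypothetical helper: alternating chunks of i and j characters via the marker trick.
    // Assumption: '\n' does not occur anywhere in s.
    static String[] splitAlternating(String s, int i, int j) {
        // Step a) append the marker after every full group of i + j characters.
        String marked = s.replaceAll("(.{" + (i + j) + "})", "$1\n");
        // Step b) split i characters into each marked group, or at the marker itself.
        return marked.split("(?<=\\G.{" + i + "})(?=.{" + j + "}\n)|\n");
    }

    public static void main(String[] args) {
        // prints [12, 345, 67, 890, ab, cde, f]
        System.out.println(Arrays.toString(splitAlternating("1234567890abcdef", 2, 3)));
        // The tail "678" has no marker after it, so it stays in one piece:
        // prints [12, 345, 678]
        System.out.println(Arrays.toString(splitAlternating("12345678", 2, 3)));
    }
}
```

If the tail must be split as well, one of the non-split() approaches in the other answers is probably the safer choice.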
Upvotes: 2
Reputation: 7880
I don't think you can do this with split(), because every match should be aware of the pattern previously matched.
If you don't want to manually iterate over the string's characters, you can use something like this:
Matcher m = Pattern.compile("(.{0,2})(.{0,3})").matcher("1234567890abcdef");
List<String> list = new ArrayList<>();
while (m.find()) {
    // group 1 holds up to 2 characters, group 2 up to 3; skip empty groups at the end
    for (int i = 1; i <= 2; i++) {
        if (!m.group(i).isEmpty()) {
            list.add(m.group(i));
        }
    }
}
System.out.println(list); // prints [12, 345, 67, 890, ab, cde, f]
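Since the question asks for variable i and j, the same idea can be parameterized. The following is a minimal sketch, not a definitive implementation: the class name AlternatingGroups and the method name splitAlternating are hypothetical, it assumes i and j are both at least 1, and it adds Pattern.DOTALL (not in the snippet above) so that line breaks in the input count as ordinary characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AlternatingGroups {
    // Hypothetical helper: alternating chunks of up to i and up to j characters.
    // Assumption: i >= 1 and j >= 1.
    static List<String> splitAlternating(String s, int i, int j) {
        // DOTALL is an addition here so that '.' also matches line terminators.
        Pattern p = Pattern.compile("(.{0," + i + "})(.{0," + j + "})", Pattern.DOTALL);
        Matcher m = p.matcher(s);
        List<String> parts = new ArrayList<>();
        while (m.find()) {
            for (int g = 1; g <= 2; g++) {
                // The last matches can leave one or both groups empty; skip those.
                if (!m.group(g).isEmpty()) {
                    parts.add(m.group(g));
                }
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitAlternating("1234567890abcdef", 2, 3)); // prints [12, 345, 67, 890, ab, cde, f]
        System.out.println(splitAlternating("12345678", 2, 3));         // prints [12, 345, 67, 8]
    }
}
```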
Upvotes: 5
Reputation: 7880
Here is another simple solution which doesn't make use of regular expressions:
String s = "1234567890abcdef";
int strLen = s.length();
List<String> list = new ArrayList<>();
for (int lastIndex = 0; lastIndex < strLen;) {
    int numChars = list.size() % 2 == 0 ? 2 : 3; // this alternates substrings of length 2 and 3
    if (strLen - lastIndex < numChars)
        list.add(s.substring(lastIndex)); // fewer characters left than requested: take the rest
    else
        list.add(s.substring(lastIndex, lastIndex+numChars));
    lastIndex += numChars;
}
System.out.println(list); // prints [12, 345, 67, 890, ab, cde, f]
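The loop above hardcodes 2 and 3. A sketch of the same approach as a method taking i and j might look like the following; the names SubstringSplit and splitAlternating are made up for illustration, and the argument check is an added assumption, since a non-positive length would otherwise never advance the loop.

```java
import java.util.ArrayList;
import java.util.List;

public class SubstringSplit {
    // Hypothetical helper: alternating substrings of length i and j, no regex involved.
    static List<String> splitAlternating(String s, int i, int j) {
        if (i < 1 || j < 1) {
            // Added guard (assumption): lengths below 1 would loop forever.
            throw new IllegalArgumentException("i and j must be at least 1");
        }
        List<String> parts = new ArrayList<>();
        int pos = 0;
        while (pos < s.length()) {
            // Even-numbered chunks get length i, odd-numbered chunks get length j.
            int len = parts.size() % 2 == 0 ? i : j;
            // Clamp to the end of the string so the last chunk may be shorter.
            int end = Math.min(pos + len, s.length());
            parts.add(s.substring(pos, end));
            pos = end;
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitAlternating("1234567890abcdef", 2, 3)); // prints [12, 345, 67, 890, ab, cde, f]
    }
}
```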
Upvotes: 2
Reputation: 2776
O(n) solution by iterating over the characters:
private static List<String> splitByPattern(String str, List<Integer> pattern) {
    int currentPatternIndex = 0;
    int iterationsTillNextSplit = pattern.get(currentPatternIndex);
    StringBuilder stringBuilder = new StringBuilder();
    List<String> strs = new ArrayList<>();
    for (char c : str.toCharArray()) {
        if (iterationsTillNextSplit == 0) { // Time to split
            strs.add(stringBuilder.toString());
            stringBuilder = new StringBuilder();
            // move on to the next chunk length, wrapping around the pattern
            iterationsTillNextSplit = pattern.get(++currentPatternIndex % pattern.size());
        }
        stringBuilder.append(c);
        iterationsTillNextSplit--;
    }
    strs.add(stringBuilder.toString()); // add the last (possibly shorter) chunk
    return strs;
}
Usage:
System.out.println(splitByPattern("1234567890abcdef", Arrays.asList(2, 3)));
Output:
[12, 345, 67, 890, ab, cde, f]
Upvotes: 2