Reputation: 840
I'm searching for a way to delete each 4th occurrence of a character (a-zA-Z) in a row.
For example, if I have the following string:
helloooo I am veeeeeeeeery busy right nowww because I am working veeeeeery hard
I want to delete the 4th, 5th, 6th, ... occurrence of any character in a row. The r in the word hard, however, must NOT be deleted: although r has already appeared several times in the string, this one is not the 4th r in a row, because it is surrounded by other characters. The result should be:
hellooo I am veeery busy right nowww because I am working veeery hard
I have already searched for a way to do this. I found a way to replace/delete the 4th occurrence of a character in a string, but not a way to replace/delete the 4th occurrence of a character in a row.
Thanks in advance.
Upvotes: 2
Views: 108
Reputation: 4874
The regex you want is ((.)\2{2})\2*. Not quite sure what that is in Java-ese, but what it does is match any single character and then 2 additional instances of that character, followed by any number of additional instances. Then replace it with the contents of the first capture group (\1) and you're good to go.
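For reference, here is a minimal sketch of how that pattern might look in Java. The class and method names (CollapseRuns, collapseAnyChar) are made up for illustration. Java string literals need the backslashes doubled, and the replacement string refers to the first group as $1 rather than \1. Note that the dot matches any character except line terminators, so runs of spaces or digits are shortened too, not only letters.

```java
public class CollapseRuns {
    // Hypothetical helper: keeps at most three of any repeated character.
    static String collapseAnyChar(String input) {
        // Group 1 = three identical characters; group 2 = the character itself.
        // The trailing \2* swallows any further repeats, which "$1" then drops.
        return input.replaceAll("((.)\\2{2})\\2*", "$1");
    }

    public static void main(String[] args) {
        System.out.println(collapseAnyChar("helloooo veeeeeeeeery hard"));
    }
}
```

With \2* a run of exactly three also matches, but it is replaced by itself, so the output is unchanged for such runs.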
Upvotes: 2
Reputation: 4286
The function may be written like this:
    public static String transform(String input) {
        if (input.isEmpty()) {
            return input;
        } else {
            final StringBuilder sb = new StringBuilder();
            char lastChar = '\0';   // previous character seen (NUL before the first one)
            int duplicates = 0;     // how many times lastChar has repeated so far
            for (int i = 0; i < input.length(); i++) {
                final char curChar = input.charAt(i);
                if (curChar == lastChar) {
                    duplicates++;
                    // keep the 2nd and 3rd character of a run, skip the 4th onwards
                    if (duplicates < 3) {
                        sb.append(curChar);
                    }
                } else {
                    // a new run starts: always keep its first character
                    sb.append(curChar);
                    lastChar = curChar;
                    duplicates = 0;
                }
            }
            return sb.toString();
        }
    }
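It makes a single pass over the input without any pattern matching, so I think it's faster than regex. Note that it shortens runs of any character, not only a-zA-Z.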
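Since the question only asks about a-zA-Z, the same single-pass idea can be restricted to letters. The following is only a sketch under that assumption; the names RunLimiter, limitLetterRuns and the maxRun parameter are hypothetical and not part of the answer above.

```java
public class RunLimiter {
    // Hypothetical variant: only runs of ASCII letters are shortened to maxRun.
    static String limitLetterRuns(String input, int maxRun) {
        StringBuilder sb = new StringBuilder(input.length());
        int run = 0; // length of the current run of identical characters
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            // extend the run if c repeats the previous character, else start a new run
            run = (i > 0 && c == input.charAt(i - 1)) ? run + 1 : 1;
            boolean isLetter = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
            // non-letters are always kept; letters only while the run is short enough
            if (!isLetter || run <= maxRun) {
                sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String in = "helloooo I am veeeeeeeeery busy right nowww because I am working veeeeeery hard";
        System.out.println(limitLetterRuns(in, 3));
    }
}
```

Comparing against the previous character instead of a sentinel also avoids treating a leading NUL character as a repeat.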
Upvotes: 3
Reputation: 785846
In Java you can use this replacement based on back-references:
str = str.replaceAll("(([a-zA-Z])\\2\\2)\\2+", "$1");
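To try this on the string from the question, the one-liner can be wrapped in a small program. This is just a sketch; the class and method names (ReplaceDemo, trimLetterRuns) are invented for the example. The pattern requires three identical letters plus at least one more (\\2+), so runs of exactly three such as "nowww" are not matched at all, and non-letters are left alone.

```java
public class ReplaceDemo {
    // Hypothetical wrapper around the replaceAll call shown above.
    static String trimLetterRuns(String str) {
        // group 2 = one letter, group 1 = that letter three times;
        // \\2+ matches the surplus repeats, which "$1" discards
        return str.replaceAll("(([a-zA-Z])\\2\\2)\\2+", "$1");
    }

    public static void main(String[] args) {
        String str = "helloooo I am veeeeeeeeery busy right nowww because I am working veeeeeery hard";
        System.out.println(trimLetterRuns(str));
    }
}
```

The match is case-sensitive, so a mixed-case run like "aAaA" is not treated as a repeat.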
Upvotes: 2