Reputation: 116
I want to remove characters that repeat more than twice in a word. For example
"hhaaappppyyyyyyy mmoooooorning friendsssssssssssssss, good goood day"
to
"hhaappyy mmoorning friendss, good good day"
I have tried something like this, but it is not reducing to exactly 2 repetitions.
gsub('([[:alpha:]])\\1{2}', '\\1',
'hhaaappppyyyyyyy mmoooooorning friendsssssssssssssss, good goood day')
#[1] "hhappyyy mmoorning friendsssss, good god day"
Thank you.
Upvotes: 4
Views: 228
Reputation: 1
package test.com;

public class LimitCharCount {
    public static void main(String[] args) {
        String str = "gggkjjkjkjjjjjsssslklkkkkkk";
        char[] ch = str.toCharArray();
        StringBuilder result = new StringBuilder();
        // Visit every character, including the last one.
        for (int i = 0; i < ch.length; i++) {
            // Keep the first two characters, and any later character that
            // does not match both of the two characters before it.
            if (i < 2 || !(ch[i] == ch[i - 1] && ch[i] == ch[i - 2])) {
                result.append(ch[i]);
            }
        }
        System.out.println(result);
    }
}
Output: ggkjjkjkjjsslklkk
Upvotes: 0
Reputation: 639
fwiw, here is another solution:
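f = function(x){
  x = strsplit(x, '')[[1]]               # split into single characters
  x = rle(x)                             # run-length encode consecutive repeats
  x = rep(x$values, pmin(2, x$lengths))  # cap every run at 2 characters
  paste(x, collapse='')
}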
example:
x = "hhaaappppyyyyyyy mmoooooorning friendsssssssssssssss, good goood day"
f(x)
[1] "hhaappyy mmoorning friendss, good good day"
however, gsub is a little easier...
Upvotes: 1
Reputation: 562
Same approach as Wiktor Stribiżew's answer, but in JavaScript, and it collapses repeats of any character (digits and punctuation too), in case you need that.
var sInput = "hhaaappppyyyyyyy mmoooooorning friendsssssssssssssss, good goood day";
var sOutput = sInput.replace(/(.)\1{2,}/g, "$1$1");
console.log(sOutput);
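If only letters should be collapsed (as [[:alpha:]] does in the R version), a rough sketch is to swap the dot for a letter class. The helper names squeezeAny and squeezeLetters below are made up for illustration, and \p{L} assumes an engine with Unicode property escapes (the "u" flag):

```javascript
// Hypothetical helper names, not part of the original answer.

// Collapse runs of 3+ identical characters of any kind down to 2.
function squeezeAny(s) {
  return s.replace(/(.)\1{2,}/g, "$1$1");
}

// Collapse only runs of letters; \p{L} requires the "u" flag.
function squeezeLetters(s) {
  return s.replace(/(\p{L})\1{2,}/gu, "$1$1");
}

var sample = "goood!!!! 1000000 dayyy";
console.log(squeezeAny(sample));     // good!! 100 dayy
console.log(squeezeLetters(sample)); // good!!!! 1000000 dayy
```

The dot version also shortens numbers such as 1000000, which is probably not wanted when cleaning up text, so the letter class is the safer choice there.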
Upvotes: 1
Reputation: 626728
You need to use the {2,} quantifier and two \1 backreferences in the replacement:
s<-'hhaaappppyyyyyyy mmoooooorning friendsssssssssssssss, good goood day'
gsub('([[:alpha:]])\\1{2,}', '\\1\\1', s)
# => [1] "hhaappyy mmoorning friendss, good good day"
See the R demo.
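The ([[:alpha:]])\\1{2,} pattern matches and captures a letter into Group 1, and then matches 2 or more further repetitions of that same char. The two \1 placeholders in the replacement pattern replace the whole match with 2 occurrences of the char. It is valid to use two \1 placeholders because every match is at least 3 identical chars. The \\1{2} in the original attempt matches exactly three identical letters at a time and replaces each such triple with a single letter, so longer runs are only partially reduced.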
Upvotes: 6