Reputation: 509
I am trying to write a regular expression that removes letters from other languages and collapses punctuation marks that are repeated more than once in a row.
To remove the letters from other languages, I use the usual expression:
st = test.replaceAll("[^ a-zA-Z0-9]", "");
But I don't understand what I should add so that it does not remove all punctuation marks and spaces, only those that are repeated:
String test = "agagahh,,,mvf .... AJFKL ???";
I would appreciate any help.
Input: "agagahh,,,mvf .... AJFKL ???"
Output: "agagahh,mvf . AJFKL ?"
Upvotes: 1
Views: 68
Reputation: 89224
You can first remove all characters that are not alphanumeric, a space, or one of the accepted punctuation marks. Then you can use a capturing group with a backreference to match a space or punctuation mark followed by one or more copies of the same character, and replace the whole run with a single copy.
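String str = "agagahh,,,mvf .... AJFKL ???";
// Step 1: drop everything except spaces, ASCII letters and digits, and . ? ,
// Step 2: collapse each run of a repeated space or punctuation mark to one character
String res = str.replaceAll("[^ a-zA-Z0-9.?,]", "").replaceAll("([ .,?])\\1+", "$1");
System.out.println(res); // agagahh,mvf . AJFKL ?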
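The same two steps can also be wrapped in a small helper, shown here as a minimal sketch rather than the only way to do it. The class name PunctuationCleaner and the method name clean are made up for this example, and the set of kept punctuation marks (period, comma, question mark) is an assumption taken from the sample input. The letter range is written as A-Z here: a range written as A-z also covers the characters [ \ ] ^ _ and the backtick, so those would slip through the filter.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; the names are illustrative, not from the question.
class PunctuationCleaner {
    // Anything that is not a space, an ASCII letter or digit, or one of . , ? gets dropped.
    private static final Pattern DISALLOWED = Pattern.compile("[^ a-zA-Z0-9.,?]");

    // A kept separator followed by one or more copies of itself (\1 is the backreference).
    private static final Pattern REPEATED = Pattern.compile("([ .,?])\\1+");

    static String clean(String input) {
        // Step 1: strip disallowed characters.
        String filtered = DISALLOWED.matcher(input).replaceAll("");
        // Step 2: replace each run of the same separator with a single copy ($1).
        return REPEATED.matcher(filtered).replaceAll("$1");
    }

    public static void main(String[] args) {
        System.out.println(clean("agagahh,,,mvf .... AJFKL ???")); // agagahh,mvf . AJFKL ?
    }
}
```

Precompiling the patterns avoids recompiling the regex on every call, which String.replaceAll would otherwise do. To allow more punctuation marks, add them to both character classes.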
Upvotes: 1