Alpharius

Reputation: 509

What should I add to a regular expression to remove punctuation marks that appear more than 1 time?

I am trying to write a regular expression that removes letters from other languages and collapses punctuation marks that are repeated more than once in a row.

To remove the letters from other languages, I use the usual expression:

st = test.replaceAll("[^ a-zA-Z0-9]", "");

But I don't understand what I should add so that it removes not all punctuation marks and spaces, but only those that are repeated more than once in a row: String test = new String("agagahh,,,mvf .... AJFKL ???");

I would be glad of any help.

Input: "agagahh,,,mvf .... AJFKL ???"

Output: "agagahh,mvf . AJFKL ?"

Upvotes: 1

Views: 68

Answers (1)

Unmitigated

Reputation: 89224

You can first remove all characters that are not alphanumeric, a space, or one of the accepted punctuation marks. Then, you can use a capturing group with a backreference to match a punctuation mark (or space) followed by one or more copies of the same character, and replace the whole run with a single copy.

String str = "agagahh,,,mvf ....      AJFKL  ???";
String res = str.replaceAll("[^ a-zA-Z0-9.?,]", "").replaceAll("([ .,?])\\1+", "$1");
System.out.println(res);
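If you need this in more than one place, the same two steps can be wrapped in a small helper. This is only a sketch: the class name PunctuationCollapser and the method name clean are made up for illustration, and it assumes the accepted punctuation is limited to '.', ',' and '?'. It uses the range A-Z, since A-z would also let the characters [ \ ] ^ _ ` through.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are illustrative, not from any library.
class PunctuationCollapser {
    // Anything that is not a space, ASCII letter, digit, or one of . , ? gets dropped.
    private static final Pattern DISALLOWED = Pattern.compile("[^ a-zA-Z0-9.,?]");

    // A space or accepted punctuation mark followed by one or more copies of itself.
    // \\1 is a backreference to whatever character group 1 captured.
    private static final Pattern REPEATED = Pattern.compile("([ .,?])\\1+");

    static String clean(String input) {
        // Step 1: strip characters outside the accepted set.
        String allowedOnly = DISALLOWED.matcher(input).replaceAll("");
        // Step 2: collapse each run of the same character into a single copy.
        return REPEATED.matcher(allowedOnly).replaceAll("$1");
    }

    public static void main(String[] args) {
        // prints: agagahh,mvf . AJFKL ?
        System.out.println(clean("agagahh,,,mvf .... AJFKL ???"));
    }
}
```

Precompiling the patterns avoids recompiling the regex on every call, which String.replaceAll does internally. Note that only consecutive repeats of the same character are collapsed, so a mixed run such as ".," is left unchanged.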

Upvotes: 1
