DevyDev

Reputation: 886

Remove non-numeric and non-letter characters from a String

I found many posts showing that you can use "[^0-9.]" to remove non-numeric characters and "[^\\p{L}\\s]+" to remove non-letter characters.

But how do I combine these two?

If I try something like

s = s.replaceAll("[^\\p{L}\\s]+" + "[^0-9.]", "");

it doesn't work.

Upvotes: 1

Views: 1315

Answers (1)

Wiktor Stribiżew

Reputation: 626747

Just combine the character classes into one:

s = s.replaceAll("[^\\p{L}\\s0-9.]+", "");

When you concatenate the strings, the resulting regex pattern is [^\\p{L}\\s]+[^0-9.], which matches one or more characters that are neither letters nor whitespace, followed by exactly one character that is neither a digit nor a period.

In your case, you want to match a character (or 1 or more characters) that is not a digit, letter, whitespace, or period. Thus, the two negated character classes should just be merged into 1, not just concatenated.
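To make the difference concrete, here is a minimal sketch comparing the merged class with the concatenated pattern from the question. The class name, method names, and sample input are made up for illustration only.

```java
class RegexMergeDemo {
    // Merged class: removes every run of characters that are not
    // a letter, whitespace, a digit, or a period.
    static String cleanMerged(String s) {
        return s.replaceAll("[^\\p{L}\\s0-9.]+", "");
    }

    // Concatenated pattern from the question: one or more non-letter,
    // non-whitespace characters FOLLOWED BY one non-digit, non-period character.
    static String cleanConcatenated(String s) {
        return s.replaceAll("[^\\p{L}\\s]+" + "[^0-9.]", "");
    }

    public static void main(String[] args) {
        String input = "Price: $4.50 (approx)!"; // hypothetical sample input
        System.out.println(cleanMerged(input));       // Price 4.50 approx
        System.out.println(cleanConcatenated(input)); // Pricepprox
    }
}
```

The concatenated version swallows "$4.50 " as a whole, because digits and the period satisfy the first class and the trailing space satisfies the second, which is why it appears not to work.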

If you also want to keep underscores, you may try a shorter version:

s = s.replaceAll("[^\\w\\s.]+", "");

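as \w matches [a-zA-Z0-9_]. Note that in Java \w is ASCII-only by default, so unlike \\p{L} it does not cover non-ASCII letters; prefix the pattern with (?U) (the UNICODE_CHARACTER_CLASS flag) to make \w Unicode-aware.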
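One thing worth checking before switching to the shorter pattern: by default Java's \w covers only ASCII word characters. The sketch below, with hypothetical class and method names, shows how a non-ASCII letter is treated with and without the (?U) flag.

```java
class WordClassDemo {
    // Default \w is [a-zA-Z_0-9], so non-ASCII letters are removed too.
    static String cleanAscii(String s) {
        return s.replaceAll("[^\\w\\s.]+", "");
    }

    // (?U) turns on UNICODE_CHARACTER_CLASS, so \w also matches
    // Unicode letters and digits.
    static String cleanUnicode(String s) {
        return s.replaceAll("(?U)[^\\w\\s.]+", "");
    }

    public static void main(String[] args) {
        String input = "Stribi\u017Cew_42!"; // \u017C is the letter ż
        System.out.println(cleanAscii(input));   // Stribiew_42
        System.out.println(cleanUnicode(input)); // Stribiżew_42
    }
}
```

If the input can contain accented or non-Latin letters, the explicit \\p{L} class or the (?U) variant is probably the safer choice.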

Upvotes: 3
