Reputation: 886
So I found many posts showing that you can use "[^0-9.]"
to remove non-numeric characters, and "[^\\p{L}\\s]+"
to remove non-letter characters.
But how do I combine these two?
If I try something like
replaceAll("[^\\p{L}\\s]+" + "[^0-9.]", "")
it doesn't work.
Upvotes: 1
Views: 1315
Reputation: 626747
Just combine the character classes into one:
s = s.replaceAll("[^\\p{L}\\s0-9.]+", "");
When you concatenate the strings, the resulting regex pattern looks like [^\\p{L}\\s]+[^0-9.]
which matches one or more characters that are neither letters nor whitespace, followed by exactly one character that is neither a digit nor a period.
In your case, you want to match one or more characters that are not a digit, letter, whitespace, or period. Thus, the two negated character classes should be merged into one, not concatenated.
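To see the difference between merging and concatenating, here is a minimal sketch. The class name, method names, and sample input are made up for illustration; only the two patterns come from the posts above.

```java
public class MergeClasses {
    // Merged class: removes every run of characters that are not
    // a letter, whitespace, a digit, or a period.
    static String merged(String s) {
        return s.replaceAll("[^\\p{L}\\s0-9.]+", "");
    }

    // Concatenated pattern from the question, kept only for comparison:
    // it needs a non-letter/non-space run FOLLOWED BY one more character
    // that is not a digit or period, so it removes the wrong things.
    static String concatenated(String s) {
        return s.replaceAll("[^\\p{L}\\s]+" + "[^0-9.]", "");
    }

    public static void main(String[] args) {
        String input = "Price: $12.50 (approx.)!"; // hypothetical sample input
        System.out.println(merged(input));       // prints "Price 12.50 approx."
        System.out.println(concatenated(input)); // digits and some letters are lost
    }
}
```

The merged version keeps the digits and the period in 12.50, while the concatenated one consumes them as part of the first class.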
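If you also want to keep underscores and only need to handle ASCII letters, you may try a shorter version:
s = s.replaceAll("[^\\w\\s.]+", "");
as \w matches [a-zA-Z0-9_] in Java by default. To make \w match Unicode letters as well, put the (?U) flag (Pattern.UNICODE_CHARACTER_CLASS) at the start of the pattern.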
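If your input may contain non-ASCII letters, it is worth checking how \w treats them before relying on the shorter pattern. The sketch below compares three variants; the class name, method names, and sample string are hypothetical, and the (?U) variant assumes Java 7 or later.

```java
public class WordClassDemo {
    // Default \w: ASCII letters, digits, and underscore only.
    static String asciiWord(String s) {
        return s.replaceAll("[^\\w\\s.]+", "");
    }

    // Explicit Unicode letters plus digits, underscore, and period.
    static String unicodeLetters(String s) {
        return s.replaceAll("[^\\p{L}\\s0-9_.]+", "");
    }

    // (?U) turns on UNICODE_CHARACTER_CLASS, so \w also covers Unicode letters.
    static String unicodeWord(String s) {
        return s.replaceAll("(?U)[^\\w\\s.]+", "");
    }

    public static void main(String[] args) {
        String input = "caf\u00e9_1!"; // "café_1!" written with a Unicode escape
        System.out.println(asciiWord(input));      // the é is removed
        System.out.println(unicodeLetters(input)); // the é is kept
        System.out.println(unicodeWord(input));    // the é is kept
    }
}
```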
Upvotes: 3