Reputation: 449
Hey, I want to sanitize a string so that it only keeps letters (a-z, A-Z, and also letters from other languages, not only English), digits, spaces, and commas. I tried replaceAll with the pattern [^a-z 0-9,]
but it deletes letters from other languages. Can someone show me how to strip only the special characters without also deleting emojis?
Upvotes: 1
Views: 11299
Reputation: 3094
I've tested this regular expression and AFAIK it works...
String result = yourString.replaceAll("[^a-zA-Z0-9]", "");
It replaces every character outside the sets a-z, A-Z, and 0-9 with the empty string. Note that this also removes spaces, commas, non-English letters, and emojis.
Upvotes: 1
Reputation: 2080
You could try getting the ASCII codes of the a-z and 0-9 characters, and if the current character is not one of them, do what you wish with it. On how to get the ASCII value of a character, refer here.
EDIT: the idea is that the characters a-z are next to each other in ASCII, and so are 0-9. So just write a simple function that returns a boolean
telling you whether the current character is in one of these ranges, and if it is not, replace it.
With this approach, though, you will have to process the string one character at a time.
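A minimal sketch of the range-check idea described above, assuming plain ASCII input is all you care about. The class and method names (AsciiFilter, isAsciiAlphanumeric, keepAsciiAlphanumeric) are made up for illustration and do not come from the answer. Like the answer itself, it drops letters from other languages and emojis.

```java
class AsciiFilter {
    // True when c is an ASCII letter or digit. This relies on each of the
    // ranges a-z, A-Z and 0-9 being contiguous in ASCII.
    static boolean isAsciiAlphanumeric(char c) {
        return (c >= 'a' && c <= 'z')
            || (c >= 'A' && c <= 'Z')
            || (c >= '0' && c <= '9');
    }

    // Walks the string one character at a time and keeps only the allowed ones.
    static String keepAsciiAlphanumeric(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isAsciiAlphanumeric(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(keepAsciiAlphanumeric("Hello, World! 123")); // prints HelloWorld123
    }
}
```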
Upvotes: 1
Reputation: 26926
In Java you can do:
yourString.replaceAll("[^\\p{L}\\p{Nd}]+", "");
The regular expression [^\p{L}\p{Nd}]+
matches every run of characters that are neither a Unicode letter nor a decimal digit.
If you only need letters (no digits), you can use the regular expression [^\p{L}]+
as follows:
yourString.replaceAll("[^\\p{L}]+", "");
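To also keep the spaces, commas, and emojis the question asks about, the character class can be widened. The sketch below is one possible way to do it, not part of the original answer: it assumes that the Unicode category \p{So} ("Symbol, other") is an acceptable stand-in for "emoji". That category covers most pictographic emoji but also keeps symbols such as ©, and it does not cover skin-tone modifiers or zero-width joiners, so combined emoji sequences may be split apart. The class name UnicodeSanitizer and the method sanitize are hypothetical.

```java
import java.util.regex.Pattern;

class UnicodeSanitizer {
    // Assumption: allowed = Unicode letters, decimal digits, "other symbols"
    // (\p{So}, used here as an approximation of emoji), commas and spaces.
    // Everything else is removed.
    private static final Pattern DISALLOWED =
        Pattern.compile("[^\\p{L}\\p{Nd}\\p{So}, ]+");

    static String sanitize(String input) {
        return DISALLOWED.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // Unicode escapes are used so the example does not depend on source encoding:
        // \u00e9 is an accented e, \u043c\u0438\u0440 is a Cyrillic word,
        // \uD83D\uDE00 is the surrogate pair for the emoji U+1F600.
        String raw = "H\u00e9llo, \u043c\u0438\u0440! \uD83D\uDE00 #42";
        // The '!' and '#' are removed; letters, digits, comma, spaces and the emoji stay.
        System.out.println(sanitize(raw));
    }
}
```

Precompiling the pattern avoids recompiling the regex on every call, which String.replaceAll would otherwise do.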
Upvotes: 0