user7415791

Reputation: 449

Replace all characters in a string except a-z, 0-9 and ,

Hey, I want to sanitize a string so that it only contains letters (a-z, A-Z, and also letters from other languages, not only English), digits, and commas. I tried ReplaceAll([^a-z 0-9,]), but it deletes letters from other languages. Can someone show me how to strip only the special characters without deleting emojis?

Upvotes: 1

Views: 11299

Answers (3)

Charlie

Reputation: 3094

I've tested this regular expression and AFAIK it works...

String result = yourString.replaceAll("[^a-zA-Z0-9]", "");

It replaces any character that isn't in the set a-z, A-Z, or 0-9 with nothing. Note that this also removes commas, spaces, and letters outside the English alphabet.

Upvotes: 1

agiro

Reputation: 2080

You could get the ASCII codes of the a-z and 0-9 characters, and if the current character is not one of them, do what you wish. On how to get the ASCII value of a character, refer here.

EDIT: the idea is that the characters a-z, and likewise 0-9, sit next to each other in ASCII. So just write a simple function that returns a boolean indicating whether the current character is one of these, and if not, replace it. For this, though, you will have to go through the string one character at a time.
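A minimal sketch of that idea, assuming Java as in the other answers. The class and method names (AsciiFilter, isAllowed, keepAllowed) are made up for illustration. Because it only checks the ASCII ranges a-z and 0-9, it drops uppercase letters, commas, spaces, and every non-English letter, so it does not by itself solve the other-languages part of the question.

```java
final class AsciiFilter {
    // Hypothetical helper: true only for the contiguous ASCII ranges a-z and 0-9.
    static boolean isAllowed(char c) {
        return (c >= 'a' && c <= 'z') || (c >= '0' && c <= '9');
    }

    // Walks the string one character at a time and keeps only allowed characters.
    static String keepAllowed(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            if (isAllowed(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // '-', ',', ' ', 'X' and '!' are all outside the two ranges and get dropped.
        System.out.println(keepAllowed("abc-123, Xyz!")); // prints abc123yz
    }
}
```

To allow more characters (uppercase letters, the comma), extra conditions would have to be added to isAllowed.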

Upvotes: 1

Davide Lorenzo MARINO

Reputation: 26926

In Java you can do:

yourString.replaceAll("[^\\p{L}\\p{Nd}]+", "");

The regular expression [^\p{L}\p{Nd}]+ matches every run of characters that are neither a Unicode letter nor a decimal digit.

If you need only letters (not numbers), you can use the regular expression [^\p{L}]+ as follows:

yourString.replaceAll("[^\\p{L}]+", "");
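The question also asks to keep commas and emojis. One possible extension of the same approach is sketched below; the class name UnicodeSanitizer and the choice to also keep spaces are assumptions, not part of the answer above. It adds the comma, the space, and the Unicode category \p{So} ("Symbol, other") to the allowed set. \p{So} covers many emoji such as U+1F600, but this is only an approximation: skin-tone modifiers, zero-width joiners, and variation selectors belong to other categories, so composite emoji may lose parts, and non-emoji symbols such as the copyright sign are let through as well.

```java
import java.util.regex.Pattern;

final class UnicodeSanitizer {
    // Matches runs of anything that is NOT a Unicode letter, a decimal digit,
    // a "Symbol, other" character (assumed here as a rough stand-in for emoji),
    // a comma, or a space.
    private static final Pattern DISALLOWED =
            Pattern.compile("[^\\p{L}\\p{Nd}\\p{So}, ]+");

    static String sanitize(String input) {
        return DISALLOWED.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        // \u00e9 = e-acute, \u00f6 = o-umlaut, \uD83D\uDE00 = U+1F600 (grinning face)
        String input = "h\u00e9llo, w\u00f6rld! \uD83D\uDE00 #42";
        // Only '!' and '#' are removed; letters, digits, comma, spaces and the emoji stay.
        System.out.println(sanitize(input));
    }
}
```

Compiling the pattern once and reusing it avoids recompiling the regular expression on every call, which String.replaceAll would otherwise do.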

Upvotes: 0
