Ashok Kumar

Reputation: 403

Regex to remove only special characters and not other language letters

I used a regular expression to remove special characters from a name, but the expression removes every character except English letters and whitespace.

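public class Main {
    public static void main(String[] args) {
        String name = "Özcan Sevim.";
        // Replaces every character that is not an ASCII letter or whitespace, including Ö.
        name = name.replaceAll("[^a-zA-Z\\s]", " ").trim();
        System.out.println(name);
    }
}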

Output:

zcan Sevim

Expected Output:

Özcan Sevim 

I get the wrong result this way. I think the right approach is to remove special characters based on their ASCII codes, so that letters from other languages are not removed. Can someone help me with a regex that removes only special characters?

Upvotes: 4

Views: 3554

Answers (3)

Michal Jonko

Reputation: 1

Use Guava's CharMatcher for that :) It is easier to read and maintain. Note that this matcher removes every non-ASCII character, so it also strips the Ö from the example name.

name = CharMatcher.ASCII.negate().removeFrom(name);
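To see what an ASCII-only filter does to the name from the question without pulling in Guava, here is a minimal plain-regex sketch. It assumes the same range Guava treats as ASCII (U+0000 to U+007F); the class and method names are hypothetical and chosen only for illustration.

```java
public class AsciiOnlyDemo {
    // Hypothetical helper: removes every character outside the 7-bit ASCII range.
    static String removeNonAscii(String input) {
        return input.replaceAll("[^\\x00-\\x7F]", "");
    }

    public static void main(String[] args) {
        // The non-ASCII letter Ö is removed, while the ASCII period is kept.
        System.out.println(removeNonAscii("Özcan Sevim.")); // prints "zcan Sevim."
    }
}
```

So a filter keyed on ASCII codes drops the accented letter and keeps the punctuation, which is the opposite of what the question asks for.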

Upvotes: 0

Youcef LAIDANI

Reputation: 59988

You can use \p{IsLatin} or \p{IsAlphabetic}

name = name.replaceAll("[^\\p{IsLatin}]", " ").trim();

Or, to remove only the punctuation, use \p{Punct} like this:

name = name.replaceAll("\\p{Punct}", " ").trim();

Outputs

Özcan Sevim

Take a look at the full list in the "Summary of regular-expression constructs" section of the java.util.regex.Pattern documentation and use the construct that fits your case.
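If names may also come from non-Latin scripts, one possible variation (not part of the answer above) is the general Unicode letter category \p{L} in place of \p{IsLatin}. The sketch below keeps letters of any script plus whitespace and replaces everything else with a space; the class and method names are made up for the example.

```java
public class KeepLettersDemo {
    // Hypothetical helper: replaces every character that is neither a
    // Unicode letter (\p{L}) nor whitespace with a space, then trims.
    static String stripSpecial(String input) {
        return input.replaceAll("[^\\p{L}\\s]", " ").trim();
    }

    public static void main(String[] args) {
        System.out.println(stripSpecial("Özcan Sevim."));  // prints "Özcan Sevim"
        System.out.println(stripSpecial("José-María"));    // prints "José María"
    }
}
```

Digits are not letters, so this version removes them too. Accents stored as separate combining marks (category \p{M}) would also be replaced unless \p{M} is added to the character class.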

Upvotes: 9

chand mohd

Reputation: 2550

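Use \W (which keeps underscores) or [^a-zA-Z0-9] as the regex to match any special character, and use String.replaceAll(regex, replacement) to replace each special character with an empty string. Remember that the first argument of String.replaceAll is itself a regex, so a special character passed to it must be escaped with a backslash (or Pattern.quote) to be treated as a literal character. Both patterns are ASCII-only, so they also remove letters such as Ö.

import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class Main {
    public static void main(String[] args) {
        String string = "hjdg$h&jk8^i0ssh6";
        // Matches any character that is not an ASCII letter or digit.
        Pattern pt = Pattern.compile("[^a-zA-Z0-9]");
        Matcher match = pt.matcher(string);
        // Remove all matches in one pass; no per-character escaping is needed.
        string = match.replaceAll("");
        System.out.println(string);
    }
}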
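For input that contains non-English letters, a possible variation on the \W idea is to compile the pattern with Pattern.UNICODE_CHARACTER_CLASS, which makes \w and \s follow the Unicode definitions. This is only a sketch under that assumption; the class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class UnicodeWordDemo {
    // With UNICODE_CHARACTER_CLASS, \w also matches non-ASCII letters and digits,
    // so the negated class below matches only non-word, non-whitespace characters.
    private static final Pattern SPECIAL =
            Pattern.compile("[^\\w\\s]", Pattern.UNICODE_CHARACTER_CLASS);

    // Hypothetical helper: deletes every character matched by SPECIAL.
    static String removeSpecial(String input) {
        return SPECIAL.matcher(input).replaceAll("");
    }

    public static void main(String[] args) {
        System.out.println(removeSpecial("Özcan Sevim."));       // prints "Özcan Sevim"
        System.out.println(removeSpecial("hjdg$h&jk8^i0ssh6"));  // prints "hjdghjk8i0ssh6"
    }
}
```

Underscores count as word characters under \w, so they are kept; add `_` to the negated set's exclusions if they should go as well.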

Upvotes: -1
