user3187166

Reputation: 55

Remove all special characters from a string in Java

I want to know how to completely remove ALL special characters from a string. In other words, I want to keep only the words and eliminate every other character, such as +-òç@èé.

Currently I use:

myString =  Normalizer.normalize(myString, Normalizer.Form.NFD).replaceAll("[^\\p{ASCII}]", "");

But some special characters still remain.

Upvotes: 2

Views: 9316

Answers (2)

Bhuvan Mysore

Reputation: 15

Java strings are sequences of UTF-16 code units. The code below takes the numeric value of each character and checks whether it falls in one of the ranges treated as special: space, ASCII punctuation and symbols, and everything above 126. It makes a single pass over the string, so the number of character checks is O(n).

public class RemoveSpecialCharacters {

    // Treats ASCII space, punctuation and symbols, plus every value from 123 upward, as special.
    // ASCII letters, digits, and control characters below 32 are kept.
    private static boolean isSpecialCharacter(int b) {
        return (b >= 32 && b <= 47) || (b >= 58 && b <= 64)
                || (b >= 91 && b <= 96) || b >= 123;
    }

    public static String removeSpecialCharacters(String a) {
        // StringBuilder keeps the loop linear; repeated String += copies the result on every append.
        StringBuilder result = new StringBuilder(a.length());
        for (int i = 0; i < a.length(); i++) {
            char c = a.charAt(i);
            if (!isSpecialCharacter(c)) {
                result.append(c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeSpecialCharacters("fleCKHE)_+_+"));
    }
}

Output: fleCKHE
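
To see what the range test above keeps and drops, here is a small self-contained sketch. The class name RangeCheckDemo and the method names isSpecial and strip are made up for illustration; the ranges are copied from the answer's isSpecialCharacter. It suggests three things worth knowing before relying on this approach: digits survive, accented letters are dropped entirely rather than reduced to their base letter, and control characters such as tab pass through.

```java
final class RangeCheckDemo {
    // Same ranges as the answer's isSpecialCharacter, copied so this sketch stands alone.
    static boolean isSpecial(int b) {
        return (b >= 32 && b <= 47) || (b >= 58 && b <= 64)
                || (b >= 91 && b <= 96) || b >= 123;
    }

    // Keeps only the characters that the range test does not flag as special.
    static String strip(String s) {
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (!isSpecial(c)) {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("abc 123"));   // abc123 (space removed, digits kept)
        System.out.println(strip("caf\u00E9")); // caf (the accented e, value 233, is dropped)
        System.out.println(strip("a\tb").length()); // 3 (tab, value 9, is kept)
    }
}
```

If only letters should remain, the digit range 48..57 and the values below 32 would also have to be treated as special.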

Upvotes: 0

laalto

Reputation: 152817

Replace the \p{ASCII} regex class with a stricter set that only contains the chars you allow. For example,

myString =  Normalizer.normalize(myString, Normalizer.Form.NFD).replaceAll("[^a-zA-Z]", "");

will first decompose accented chars like é to two parts e + combining ´ (normal form D) and then the regex will remove any character that is not ASCII a..z or A..Z.
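
A minimal runnable sketch of the approach described above, using only java.text.Normalizer and String.replaceAll. The class name LettersOnlyDemo and both method names are invented for illustration, and lettersAndSpaces is an assumed variant (not part of the answer) that adds a space to the allowed set so that words stay separated.

```java
import java.text.Normalizer;

final class LettersOnlyDemo {
    // Decompose accented characters (NFD), then drop everything that is not an ASCII letter.
    static String lettersOnly(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("[^a-zA-Z]", "");
    }

    // Hypothetical variant: same idea, but a space is allowed so words are not merged.
    static String lettersAndSpaces(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFD).replaceAll("[^a-zA-Z ]", "");
    }

    public static void main(String[] args) {
        // The escapes spell the accented sample characters from the question,
        // written this way to avoid depending on the source file encoding.
        String input = "fleCKHE)_+_+ \u00F2\u00E7@\u00E8\u00E9";
        System.out.println(lettersOnly(input));      // fleCKHEocee
        System.out.println(lettersAndSpaces(input)); // fleCKHE ocee
    }
}
```

Note that the accented letters are kept as their base letters (o, c, e, e) because the combining marks produced by NFD are what the regex removes.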

Upvotes: 10
