Reputation: 55
How can I completely remove ALL special characters from a string? In other words, I want to keep only the words and eliminate every other character, such as +-òç@èé etc.
Currently I use
myString = Normalizer.normalize(myString, Normalizer.Form.NFD).replaceAll("[^\\p{ASCII}]", "");
But some special characters still remain.
Upvotes: 2
Views: 9316
Reputation: 15
Java strings are Unicode (stored internally as UTF-16, not UTF-8), and for ASCII characters the char value equals the ASCII code. The code below checks the numeric value of each character to decide whether it is a special character. It makes a single pass over the string.
public class RemoveSpecialCharacters {
    // True for the space character, ASCII punctuation, and anything above '~' (126).
    // ASCII letters and digits are kept; control characters below 32 are also left untouched.
    private static boolean isSpecialCharacter(int b) {
        return (b >= 32 && b <= 47) || (b >= 58 && b <= 64)
                || (b >= 91 && b <= 96) || (b >= 123 && b <= 126) || b > 126;
    }

    public static String removeSpecialCharacters(String a) {
        // StringBuilder avoids the repeated copying that String += causes inside a loop,
        // so the whole method stays linear in the length of the input.
        StringBuilder result = new StringBuilder(a.length());
        for (int i = 0; i < a.length(); i++) {
            char c = a.charAt(i);
            if (!isSpecialCharacter(c)) {
                result.append(c);
            }
        }
        return result.toString();
    }

    public static void main(String[] args) {
        System.out.println(removeSpecialCharacters("fleCKHE)_+_+"));
    }
}
Output: fleCKHE
Upvotes: 0
Reputation: 152817
Replace the \p{ASCII} regex class with a stricter set that only contains the chars you allow. For example,
myString = Normalizer.normalize(myString, Normalizer.Form.NFD).replaceAll("[^a-zA-Z]", "");
will first decompose accented chars like é into two parts, e + combining ´ (normal form D), and then the regex will remove any character that is not ASCII a..z or A..Z.
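As a rough, self-contained sketch of the above: the class name SpecialCharStripper and both method names are made up for illustration. The second method is an assumption about what the asker may want, since "leave only the words" suggests keeping whitespace so the words stay separated; it is not part of the one-liner above.

```java
import java.text.Normalizer;

// Hypothetical helper class, for illustration only.
class SpecialCharStripper {

    // Decompose accented characters (NFD), then drop everything that is not an ASCII letter.
    static String lettersOnly(String input) {
        return Normalizer.normalize(input, Normalizer.Form.NFD)
                .replaceAll("[^a-zA-Z]", "");
    }

    // Assumed variant: same idea, but whitespace is kept so words stay separated.
    static String lettersAndSpaces(String input) {
        return Normalizer.normalize(input, Normalizer.Form.NFD)
                .replaceAll("[^a-zA-Z\\s]", "");
    }

    public static void main(String[] args) {
        // '+', '-' and '@' are ASCII, so [^\p{ASCII}] keeps them; [^a-zA-Z] removes them.
        System.out.println(lettersOnly("+-òç@èé"));                              // ocee
        System.out.println(lettersAndSpaces("Crème brûlée, s'il vous plaît!")); // Creme brulee sil vous plait
    }
}
```

Note that digits are removed as well; if they should survive, adding 0-9 to the character class would be the natural change.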
Upvotes: 10