Reputation: 19195
How to replace accented characters with plain alphabet characters?
Before you mark this question as a duplicate: I tried various solutions, but none of them worked for me.
See the following code:
import org.apache.commons.lang3.StringUtils;
import java.text.Normalizer;
import java.util.regex.Pattern;

public class AccentsTest
{
    public static void main(String[] arguments)
    {
        String textWithAccents = "Et ça sera sa moitié.";
        System.out.println(textWithAccents);
        System.out.println(stripAccents(textWithAccents));
        System.out.println(deAccent(textWithAccents));
        System.out.println(normalize(textWithAccents));
        System.out.println(stripAccents2(textWithAccents));
    }

    // http://stackoverflow.com/a/15191069/3764804
    public static String stripAccents(String s)
    {
        return StringUtils.stripAccents(s);
    }

    // http://stackoverflow.com/a/1215117/3764804
    public static String deAccent(String str)
    {
        String nfdNormalizedString = Normalizer.normalize(str, Normalizer.Form.NFD);
        Pattern pattern = Pattern.compile("\\p{InCombiningDiacriticalMarks}+");
        return pattern.matcher(nfdNormalizedString).replaceAll("");
    }

    // http://stackoverflow.com/a/8523728/3764804
    public static String normalize(String string)
    {
        string = Normalizer.normalize(string, Normalizer.Form.NFD);
        string = string.replaceAll("[^\\p{ASCII}]", "");
        return string;
    }

    // http://stackoverflow.com/a/15190787/3764804
    public static String stripAccents2(String s)
    {
        s = Normalizer.normalize(s, Normalizer.Form.NFD);
        s = s.replaceAll("[\\p{InCombiningDiacriticalMarks}]", "");
        return s;
    }
}
It outputs:
Et ?a sera sa moiti?.
Et ?a sera sa moiti?.
Et ?a sera sa moiti?.
Et a sera sa moiti.
Et ?a sera sa moiti?.
However, I want it to output the text in plain alphabet characters which would be the following:
Et ca sera sa moitie.
How can it be done? Is something wrong with my IDE? I'm using IntelliJ.
Upvotes: 1
Views: 1946
Reputation: 19195
It was an encoding issue. After I changed the .java source file's encoding from windows-1252 to UTF-8, all of the code examples work properly and output the expected text.
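For anyone who wants to check this independently of the file encoding, here is a minimal sketch. It writes the accented characters as Unicode escapes, so the source is pure ASCII and compiles the same way under any encoding. The class name EncodingCheck and the helper misdecode are made up for illustration. The helper assumes, as one plausible explanation of the output in the question, that the windows-1252 bytes were read as UTF-8; it uses ISO-8859-1 as a stand-in because ç and é have the same byte values (0xE7, 0xE9) in both encodings.

```java
import java.nio.charset.StandardCharsets;
import java.text.Normalizer;

class EncodingCheck
{
    // Same approach as deAccent in the question: decompose, then drop combining marks.
    public static String deAccent(String s)
    {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{InCombiningDiacriticalMarks}+", "");
    }

    // Hypothetical helper: simulates single-byte encoded text being decoded as UTF-8.
    // The lone bytes 0xE7 and 0xE9 are malformed UTF-8, so each becomes U+FFFD.
    public static String misdecode(String s)
    {
        byte[] singleByte = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(singleByte, StandardCharsets.UTF_8);
    }

    public static void main(String[] args)
    {
        // Escapes instead of literal accents, so the file encoding does not matter.
        String text = "Et \u00e7a sera sa moiti\u00e9.";
        System.out.println(deAccent(text)); // prints "Et ca sera sa moitie."

        String broken = misdecode(text);
        // The accents are already lost before any stripping code runs.
        System.out.println(broken.indexOf('\uFFFD') >= 0);   // prints "true"
        // U+FFFD has no decomposition, so accent stripping leaves it untouched.
        System.out.println(deAccent(broken).equals(broken)); // prints "true"
    }
}
```

If that assumption holds, it would also explain why only normalize printed text without a "?": it removes every non-ASCII character, including U+FFFD. When compiling from the command line, javac's -encoding option (for example -encoding UTF-8) tells the compiler which encoding the source files use.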
Upvotes: 1