Oskar

Reputation: 482

Java Transliteration / replacing special characters to match ISO 8859-1 standard (LATIN-1)

I have to replace special characters with standard LATIN-1 characters. I have spent a long time searching for possible solutions. For now I have code like this (using the ICU4J library):

import com.ibm.icu.text.Transliterator;


public class ApiUtils {
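    // Compound ID: convert any script to Latin, fold Latin to ASCII,
    // remove punctuation, then apply NFKD normalization.
    private static final Transliterator transliterator = Transliterator.getInstance("Any-Latin; Latin-ASCII;  [[:P:]] Remove; NFKD;");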

    public static String replaceSpecialCharacters(String text) {
        if (text == null) {
            return null;
        }
        return transliterator.transliterate(text);
    }
}

It works pretty well, e.g. (input | output):

'бвгджзклмнпрстфхцчшщЬ' | 'bvgdzzklmnprstfhccss'
'ÀÂÇÉÔÛ'                | 'AACEOU'

However, it misses some special characters, e.g. ə, c, Ҝ, ҝ, Ө, ө, Ү, ү, Ҹ, ҹ. I could build a dictionary for characters like these, but I'm looking for a universal solution. Do you know of a good Java library or another way to handle this?
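
For reference, the dictionary fallback mentioned above could look roughly like the sketch below. It uses only the JDK and would run on the output of the transliterator. The class name `Latin1Fallback`, the method `applyFallback`, and every mapping in the table are assumptions made up for illustration; they are not part of ICU4J or any standard, and the ASCII replacements are a guess at what would be acceptable.

```java
import java.nio.charset.CharsetEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;

public class Latin1Fallback {
    // Hypothetical hand-picked replacements for characters that remain untouched.
    // Unicode escapes keep the source file ASCII-only.
    private static final Map<Character, String> FALLBACK = Map.of(
        '\u0259', "a",  // U+0259 Latin small schwa
        '\u04D9', "a",  // U+04D9 Cyrillic small schwa
        '\u049C', "K",  // U+049C Cyrillic capital ka with vertical stroke
        '\u049D', "k",  // U+049D Cyrillic small ka with vertical stroke
        '\u04E8', "O",  // U+04E8 Cyrillic capital barred o
        '\u04E9', "o",  // U+04E9 Cyrillic small barred o
        '\u04AE', "U",  // U+04AE Cyrillic capital straight u
        '\u04AF', "u",  // U+04AF Cyrillic small straight u
        '\u04B8', "C",  // U+04B8 Cyrillic capital che with vertical stroke
        '\u04B9', "c"   // U+04B9 Cyrillic small che with vertical stroke
    );

    // Keeps every char that ISO 8859-1 can encode, replaces known leftovers
    // from the table, and marks anything else with '?'.
    // Note: a supplementary character (surrogate pair) yields two '?'.
    public static String applyFallback(String text) {
        if (text == null) {
            return null;
        }
        CharsetEncoder latin1 = StandardCharsets.ISO_8859_1.newEncoder();
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (latin1.canEncode(ch)) {
                out.append(ch);
            } else {
                out.append(FALLBACK.getOrDefault(ch, "?"));
            }
        }
        return out.toString();
    }
}
```

It would be called as `Latin1Fallback.applyFallback(ApiUtils.replaceSpecialCharacters(text))`. The drawback is the one already stated: the table has to be maintained by hand, which is why a universal solution is preferable.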
