az4dan

Reputation: 659

Java character conversion

So basically I'm trying to convert characters from ISO-8859-2 to windows-1250. Unfortunately, none of the Java encoder/decoder classes seemed to solve my problem.

What I'm doing at the moment is:

str = str.replace("ń", new String(new char[]{241}));

It actually converts the sequence, but not to the correct character.

The bytes -59, -124 (ń) become -61, -79. Isn't it supposed to become either 241 or -24?

Upvotes: 1

Views: 3651

Answers (3)

jbm

Reputation: 1492

Echoing Ingo's answer. Perhaps a chunk of code will demonstrate what is happening:

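// "\u0144" is ń; the escape avoids depending on the source file's encoding.
String text = "\u0144";
String[] names = { "UTF-16BE", "UTF-8", "ISO-8859-2", "windows-1250" };
for( String name : names ) {
    // getBytes(String) throws UnsupportedEncodingException for unknown names.
    byte[] bytes = text.getBytes( name );
    for( int i = 0; i < bytes.length; i++ ) {
        // Mask with 0xff to print each byte as an unsigned value (0 to 255).
        System.out.printf( "%s [%d]=%d%n", name, i, bytes[i] & 0xff );
    }
}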

What you should really ask is who is providing the ISO-8859-2 characters, and who wants to consume the windows-1250 characters? Then how will you deal with the byte[] in which they are encoded?

Upvotes: 2

Ingo Kegel

Reputation: 47965

The encoding inside a string is always the same (UTF-16), so your code is confused. It replaces one character with another; it does not translate encodings.

Also, this code depends on the encoding of your source file. It is better to use "\u0144" instead of "ń".
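To make the byte values from the question concrete, here is a minimal sketch (the class and method names `ReplaceDemo` and `replaceLikeQuestion` are made up for illustration). It assumes the bytes in the question were obtained by encoding the string as UTF-8. The char with code 241 is U+00F1 (ñ), so the replacement swaps ń for ñ, and the UTF-8 bytes change accordingly:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class ReplaceDemo {
    // Mirrors the replacement from the question, with "\u0144" instead of a literal ń
    // so the result does not depend on the source file's encoding.
    static String replaceLikeQuestion(String str) {
        return str.replace("\u0144", new String(new char[]{241}));
    }

    public static void main(String[] args) {
        String original = "\u0144"; // ń
        String replaced = replaceLikeQuestion(original);

        // UTF-8 bytes of U+0144 (ń): prints [-59, -124]
        System.out.println(Arrays.toString(original.getBytes(StandardCharsets.UTF_8)));
        // UTF-8 bytes of U+00F1 (ñ): prints [-61, -79]
        System.out.println(Arrays.toString(replaced.getBytes(StandardCharsets.UTF_8)));
    }
}
```

The string now holds a different character, which is why no single-byte 241 shows up when it is encoded as UTF-8.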

Encodings come into play only when converting a string to bytes, as in:

str.getBytes("Cp1250")
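As a rough, self-contained illustration of that call (the names `GetBytesDemo` and `encode` are hypothetical), the sketch below encodes ń with the canonical charset name "windows-1250"; "Cp1250" is, as far as I know, accepted by the JDK as an alias for the same charset. Passing a `Charset` object instead of a name avoids the checked `UnsupportedEncodingException`:

```java
import java.nio.charset.Charset;

public class GetBytesDemo {
    // Encodes the text with the named charset; Charset.forName throws an
    // unchecked exception if the name is illegal or unsupported.
    static byte[] encode(String text, String charsetName) {
        return text.getBytes(Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        byte[] bytes = encode("\u0144", "windows-1250"); // ń
        // Prints: 1 byte(s), first = 241
        System.out.println(bytes.length + " byte(s), first = " + (bytes[0] & 0xff));
    }
}
```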

Upvotes: 2

Michael Borgwardt

Reputation: 346240

Wanting to convert a Java String from one encoding to another is fundamentally wrong - Strings are an abstraction of characters, independent of encodings (well, mostly).

In Java, encodings are recipes for converting between bytes and Strings. If you want to convert from ISO-8859-2 to windows-1250, you need to start with bytes, convert them to a String using ISO-8859-2, and convert that back to bytes using windows-1250. This can be done either with InputStreamReader/OutputStreamWriter or with new String(bytes, encoding) and string.getBytes(encoding).
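The in-memory variant of those steps might look like the sketch below (the class and method names `Transcode` and `latin2ToWin1250` are invented for the example). The sample input holds ń and ą as ISO-8859-2 bytes; ń has the same byte value in both encodings, while ą does not, which makes the conversion visible:

```java
import java.nio.charset.Charset;

public class Transcode {
    static final Charset LATIN2 = Charset.forName("ISO-8859-2");
    static final Charset WIN1250 = Charset.forName("windows-1250");

    // Decode the ISO-8859-2 bytes into a String, then encode that String as windows-1250.
    static byte[] latin2ToWin1250(byte[] latin2Bytes) {
        String text = new String(latin2Bytes, LATIN2);
        return text.getBytes(WIN1250);
    }

    public static void main(String[] args) {
        // 0xF1 is ń and 0xB1 is ą in ISO-8859-2.
        byte[] input = { (byte) 0xF1, (byte) 0xB1 };
        byte[] output = latin2ToWin1250(input);
        // Prints 0xF1 then 0xB9 (ą moves from 0xB1 to 0xB9 in windows-1250).
        for (byte b : output) {
            System.out.printf("0x%02X%n", b & 0xff);
        }
    }
}
```

For files or sockets, the same idea applies with an InputStreamReader constructed with ISO-8859-2 on the reading side and an OutputStreamWriter constructed with windows-1250 on the writing side.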

Upvotes: 5
