Reputation: 659
So basically I'm trying to convert characters from ISO-8859-2 to windows-1250. Unfortunately, none of the Java encoder/decoder classes seemed to solve my problem.
What I'm doing at the moment is:
str = str.replace("ń", new String(new char[]{241}));
It actually converts the sequence, but not to the correct character.
The bytes -59,-124 (ń) become -61,-79. Isn't the result supposed to be either 241 or -24?
Upvotes: 1
Views: 3651
Reputation: 1492
I'd echo Ingo's answer. Perhaps a chunk of code will demonstrate what is happening:
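String text = "\u0144"; // ń, written as an escape so the source file's encoding doesn't matter
String[] names = { "UTF-16BE", "UTF-8", "ISO-8859-2", "windows-1250" };
for (String name : names) {
    // Charset.forName avoids the checked UnsupportedEncodingException thrown by getBytes(String)
    byte[] bytes = text.getBytes(java.nio.charset.Charset.forName(name));
    for (int i = 0; i < bytes.length; i++) {
        // mask with 0xff so the byte prints as an unsigned value (0..255)
        System.out.printf("%s [%d]=%d%n", name, i, bytes[i] & 0xff);
    }
}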
What you should really ask is: who is providing the ISO-8859-2 characters, and who wants to consume the windows-1250 characters? And how will you deal with the byte[] in which they are encoded?
Upvotes: 2
Reputation: 47965
The encoding inside a String is always the same (UTF-16), so your code is confused: it replaces one character with another; it does not translate between encodings.
Also, this code depends on the encoding of your source file. It is better to use "\u0144" instead of "ń".
Encodings come into play when converting a String to bytes, as in
str.getBytes("Cp1250")
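To make that concrete, here is a minimal sketch. The class name EncodeDemo and the helper unsignedBytes are made up for illustration, and it uses "windows-1250", the canonical name of the charset that "Cp1250" refers to. It shows the same one-character String producing different bytes depending on the charset chosen at encoding time:

```java
import java.nio.charset.Charset;
import java.util.Arrays;

public class EncodeDemo {
    // Hypothetical helper: encode text with the named charset and
    // return each byte as an unsigned value (0..255).
    static int[] unsignedBytes(String text, String charsetName) {
        byte[] bytes = text.getBytes(Charset.forName(charsetName));
        int[] out = new int[bytes.length];
        for (int i = 0; i < bytes.length; i++) {
            out[i] = bytes[i] & 0xff;
        }
        return out;
    }

    public static void main(String[] args) {
        String s = "\u0144"; // ń, independent of the source file's encoding
        System.out.println(Arrays.toString(unsignedBytes(s, "windows-1250"))); // [241]
        System.out.println(Arrays.toString(unsignedBytes(s, "UTF-8")));        // [197, 132]
    }
}
```

The String itself never changes; only the bytes produced at the getBytes step differ.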
Upvotes: 2
Reputation: 346240
Wanting to convert a Java String from one encoding to another is fundamentally wrong: Strings are an abstraction of characters, independent of encodings (well, mostly).
In Java, encodings are recipes for converting between bytes and Strings. If you want to convert from ISO-8859-2 to windows-1250, you need to start with bytes, convert them to a String using ISO-8859-2, and convert that back to bytes using windows-1250. This can be done either using InputStreamReader/OutputStreamWriter
or new String(bytes, encoding)
and string.getBytes(encoding)
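The decode-then-encode round trip described above could look roughly like this. It is a sketch only: the class name Transcode and the method transcode are hypothetical, and it assumes the whole input fits in memory as a byte[]; for large inputs the InputStreamReader/OutputStreamWriter route would be the better fit.

```java
import java.nio.charset.Charset;

public class Transcode {
    // Hypothetical helper: decode the bytes with the source charset,
    // then encode the resulting String with the target charset.
    static byte[] transcode(byte[] input, String from, String to) {
        String text = new String(input, Charset.forName(from));
        return text.getBytes(Charset.forName(to));
    }

    public static void main(String[] args) {
        // 0xB1 is ą in ISO-8859-2; windows-1250 stores ą as 0xB9.
        byte[] latin2 = { (byte) 0xB1 };
        byte[] cp1250 = transcode(latin2, "ISO-8859-2", "windows-1250");
        System.out.printf("0x%02X%n", cp1250[0] & 0xff); // prints 0xB9
    }
}
```

The example uses ą rather than ń because ń happens to be the byte 0xF1 in both ISO-8859-2 and windows-1250, so transcoding it would leave the byte unchanged.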
Upvotes: 5