Reputation: 1666
I get a string from a 3rd party library, which is not well encoded. Unfortunately I'm not allowed to change the library or use another one...
So the actual problem is, that the 3rd party library result string will encode characters like "è ò à ù ì ä ö ü, ..." as SHIFT_JIS (Kanji) inside an UTF-8 string. But only if the character is connected to a word and isn't standalone.
For example:
"Ö Just a simple test"
"ÖJust a simple test"
I tried the following without success:
byte[] b = resultString.getBytes("Shift_JIS");
String value = new String(b, "UTF-8");
UPDATE 1:
That's the content of "resultString".
Note: The byte array shown, is without any modifications (such as getBytes("Shift_JIS"), it's just the resultString as bytes)
Do you have any ideas? Any help would be greatly appreciated. Thank you.
Upvotes: 1
Views: 25289
Reputation: 1666
Well, very strange:
As
byte[] b = resultString.getBytes("Shift_JIS");
String value = new String(b, "UTF-8");
didn't work for me I tried the following:
String value = new String(resultString.getBytes("SHIFT-JIS"), "UTF-8")
Works like a charm. Maybe it was because of the underscore and lower case character in "Shift_JIS".
Upvotes: 4