user1011394

Reputation: 1666

String encoding - Shift_JIS / UTF-8

I receive a string from a third-party library that is incorrectly encoded. Unfortunately, I'm not allowed to change the library or use another one.

The actual problem is that characters such as "è ò à ù ì ä ö ü" in the library's result string show up as Shift_JIS (Kanji) characters inside an otherwise UTF-8 string. This happens only when the character is directly attached to a word, not when it stands alone.

For example:

Standalone: "Ö Just a simple test"

Connected: "ÖJust a simple test"

I tried the following without success:

byte[] b = resultString.getBytes("Shift_JIS");
String value = new String(b, "UTF-8");
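For reference, here is a minimal sketch of the round trip this attempt relies on, assuming the damage comes from UTF-8 bytes having been decoded as Shift_JIS. The class name `MojibakeRoundTrip`, the helpers `garble` and `repair`, and the sample string are made up for illustration; the library's real output may be damaged in a different way.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class MojibakeRoundTrip {
    // Assumption: the runtime provides the Shift_JIS charset.
    static final Charset SHIFT_JIS = Charset.forName("Shift_JIS");

    // Simulates the suspected damage: UTF-8 bytes decoded as Shift_JIS.
    static String garble(String clean) {
        return new String(clean.getBytes(StandardCharsets.UTF_8), SHIFT_JIS);
    }

    // Reverses it: re-encode as Shift_JIS, then decode those bytes as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(SHIFT_JIS), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // \u00D6 is "Ö"; escaped so the source file encoding does not matter.
        String original = "\u00D6Just a simple test";
        String garbled = garble(original);
        System.out.println(garbled);
        System.out.println(repair(garbled).equals(original));
    }
}
```

The standalone case may not survive such a round trip: the second UTF-8 byte of "Ö" followed by a space is not a valid Shift_JIS byte pair, so a decoder would typically substitute a replacement character there.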

UPDATE 1:

This is the content of "resultString".

Note: The byte array shown is unmodified. No conversion such as getBytes("Shift_JIS") has been applied; it is just the resultString as bytes.

[Two screenshots: the content of resultString and its byte array]

Do you have any ideas? Any help would be greatly appreciated. Thank you.

Upvotes: 1

Views: 25289

Answers (1)

user1011394

Reputation: 1666

Well, this is very strange.

As

byte[] b = resultString.getBytes("Shift_JIS");
String value = new String(b, "UTF-8");

didn't work for me, I tried the following:

String value = new String(resultString.getBytes("SHIFT-JIS"), "UTF-8");

This works like a charm. Maybe the difference comes from the underscore and the lowercase characters in "Shift_JIS".
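One way to test that guess is to ask the runtime whether both spellings resolve to the same charset. Java documents charset names as case-insensitive, so letter case alone should not change the result; whether the hyphenated spelling is accepted depends on the aliases the runtime registers. A minimal sketch, with `CharsetNameCheck` and `sameCharset` as made-up names:

```java
import java.nio.charset.Charset;

final class CharsetNameCheck {
    // Returns true only if both names are supported and resolve to the same charset.
    static boolean sameCharset(String first, String second) {
        if (!Charset.isSupported(first) || !Charset.isSupported(second)) {
            return false;
        }
        return Charset.forName(first).equals(Charset.forName(second));
    }

    public static void main(String[] args) {
        // Compare the two spellings used in the snippets above.
        System.out.println(sameCharset("Shift_JIS", "SHIFT-JIS"));
        // Compare two spellings that differ only in letter case.
        System.out.println(sameCharset("Shift_JIS", "shift_jis"));
    }
}
```

If both spellings resolve to the same charset, the two snippets are equivalent and the earlier failure probably had a different cause.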

Upvotes: 4
