Dili

Reputation: 626

How to fix corrupt Japanese character encoding

I have the following string that I know is supposed to be displayed as Japanese text:

25“ú‚¨“¾‚ȃAƒ‹ƒeƒBƒƒbƒgƒRƒXƒZƒbƒg‹L”O

Is there any way to decode and re-encode the text so it displays properly? I already tried using Shift-JIS, but it did not produce a readable string.

string main = "25“ú‚¨“¾‚ȃAƒ‹ƒeƒBƒƒbƒgƒRƒXƒZƒbƒg‹L”O.zip";
byte[] mainBytes = System.Text.Encoding.GetEncoding("shift-jis").GetBytes(main);
string jpn = System.Text.Encoding.GetEncoding("shift-jis").GetString(mainBytes);

Thanks!

Upvotes: 0

Views: 4461

Answers (1)

rodrigo

Reputation: 98456

I think the original is Shift-JIS, but you didn't show how you tried it. So here is my attempt to re-code it:

string s1 = "25“ú‚¨“¾‚ȃAƒ‹ƒeƒBƒƒbƒgƒRƒXƒZƒbƒg‹L”O";
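// Recover the raw bytes by encoding with the code page that was wrongly used to read them (needs "using System.Text;").
byte[] bs = Encoding.GetEncoding(1252).GetBytes(s1);
// Decode those same bytes as Shift-JIS (code page 932).
string s2 = Encoding.GetEncoding(932).GetString(bs);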

And s2 is now "25日お得なアルティャbトコスセット記念", which looks a lot more like Japanese.

My assumption is that a byte array holding Shift-JIS encoded text was read using a different encoding, maybe Windows-1252. So first I try to get back the original byte array by encoding the garbled string with that wrong encoding. Then I decode those bytes with the proper encoding to get the correct text.

A few notes about my code:

  • 1252 is the numeric ID for Windows-1252, the encoding most often used by mistake. But this is just a guess; you can try other encodings and see whether the result makes more sense.
  • 932 is the numeric ID for Shift-JIS (you can also use the string name). This is also a guess, but likely right.
  • Bear in mind that decoding with the wrong encoding is generally not a reversible operation, so some characters may be lost along the way. The stray "ャb" in the result above is probably such a casualty.
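
If you want to try several candidate encodings, as the first note suggests, a rough sketch could look like the one below. The EncodingGuesser class, the TryRepair helper, and the candidate list are made up for illustration; they are not part of the code above. The sketch assumes .NET Core or .NET 5+, where the code-page encodings have to be registered first; on .NET Framework they are built in and the RegisterProvider line can be dropped. It uses exception fallbacks so that a lossy conversion is reported instead of silently producing '?'.

```csharp
using System;
using System.Text;

public static class EncodingGuesser
{
    // Hypothetical helper: returns false when the text cannot pass cleanly
    // through the given pair of code pages.
    public static bool TryRepair(string garbled, int wrongCodePage, int actualCodePage, out string repaired)
    {
        try
        {
            // Exception fallbacks make a lossy conversion throw instead of substituting '?'.
            Encoding wrong = Encoding.GetEncoding(wrongCodePage,
                EncoderFallback.ExceptionFallback, DecoderFallback.ExceptionFallback);
            Encoding actual = Encoding.GetEncoding(actualCodePage,
                EncoderFallback.ExceptionFallback, DecoderFallback.ExceptionFallback);

            // Undo the wrong decoding, then decode with the assumed real encoding.
            repaired = actual.GetString(wrong.GetBytes(garbled));
            return true;
        }
        catch (EncoderFallbackException) { repaired = ""; return false; }
        catch (DecoderFallbackException) { repaired = ""; return false; }
    }

    public static void Main()
    {
        // Assumption: running on .NET Core / .NET 5+, where code pages must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);

        string garbled = "‹L”O"; // the last four characters of the string from the question
        int[] candidates = { 1252, 1250, 28591 }; // guesses for the wrongly applied code page

        foreach (int cp in candidates)
        {
            string repaired;
            bool ok = TryRepair(garbled, cp, 932, out repaired);
            Console.WriteLine(cp + ": " + (ok ? repaired : "(not reversible)"));
        }
    }
}
```

A candidate that fails outright can be ruled out, but a candidate that succeeds is still only a guess: you have to read the output to decide whether it is sensible Japanese.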

Upvotes: 2
