Reputation: 4751
How can I convert UTF-8 strings into readable strings?
For example, ⬠ (in UTF-8) should be displayed as €.
I tried using Charset, but it did not work.
Upvotes: 1
Views: 15831
Reputation: 7623
You are trying to decode a byte array that was encoded as "ISO-8859-15" using "UTF-8":
byte[] b = "Üü?öäABC".getBytes("ISO-8859-15");
byte[] u = "Üü?öäABC".getBytes("UTF-8");
System.out.println(new String(b, "ISO-8859-15")); // OK: decoded with the charset used to encode
System.out.println(new String(b, "UTF-8")); // garbled: ISO-8859-15 bytes decoded as UTF-8
System.out.println(new String(u, "UTF-8")); // OK: decoded with the charset used to encode
Upvotes: 1
Reputation:
I think the problem here is that you're assuming a Java String is encoded with whatever charset you specified in the constructor. It's not. Internally it is UTF-16.
So "Üü?öäABC".getBytes("ISO-8859-15") actually converts a UTF-16 string to ISO-8859-15 and returns the byte representation of that.
If you want the human-readable form in your Eclipse console, just keep it as a String (in UTF-16) and call System.out.println("Üü?öäABC"). Java encodes the string for the console when printing, so you do not need to convert it yourself.
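To illustrate the point that the bytes only appear when you encode, here is a minimal sketch that encodes the same one-character string with three charsets and compares the byte counts. The class and method names (`EncodedLengths`, `encodedLength`) are made up for this example, the euro sign is written as a Unicode escape so the result does not depend on the source file encoding, and it assumes your runtime supports ISO-8859-15.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class EncodedLengths {

    // Number of bytes the text occupies once encoded with the given charset.
    static int encodedLength(String text, Charset charset) {
        return text.getBytes(charset).length;
    }

    public static void main(String[] args) {
        String euro = "\u20AC"; // the euro sign

        // One char in the String, regardless of how it is later encoded.
        System.out.println(euro.length()); // 1

        System.out.println(encodedLength(euro, StandardCharsets.UTF_8));           // 3
        System.out.println(encodedLength(euro, StandardCharsets.UTF_16BE));        // 2
        // Assumption: ISO-8859-15 is available on this runtime.
        System.out.println(encodedLength(euro, Charset.forName("ISO-8859-15")));   // 1
    }
}
```

The String itself never changes here; only the byte arrays produced by getBytes differ.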
Upvotes: 0
Reputation: 1648
A String in Java is already a Unicode representation. When you call one of the getBytes methods on it, you get an encoded representation (as bytes, thus binary values) in a specific encoding, ISO-8859-15 in your example. If you want to convert this byte array back to a Unicode string, you can do that with one of the String constructors that accept a byte array, as you did, but you must use exactly the same encoding the byte array was originally generated with. Only then can you convert it back to a Unicode string (which has no encoding, and doesn't need one).
Beware of the overloads that take no encoding, both the String constructor and the getBytes method: they use the default encoding of the platform the code is running on, which might not be the one you want.
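As a rough sketch of the advice above, the following passes the charset explicitly on both sides of the round trip, using the constants in java.nio.charset.StandardCharsets. The class and method names (`ExplicitCharsetRoundTrip`, `roundTrip`) are invented for the example, and UTF-8 is just one reasonable choice of charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, for illustration only.
class ExplicitCharsetRoundTrip {

    // Encode and decode with the same explicitly named charset.
    static String roundTrip(String text) {
        byte[] encoded = text.getBytes(StandardCharsets.UTF_8);
        return new String(encoded, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Written with Unicode escapes so the source file encoding does not matter.
        String text = "\u00DC\u00FC\u00F6\u00E4ABC";

        // The charset used by the no-argument overloads; it varies by platform.
        System.out.println(Charset.defaultCharset());

        System.out.println(roundTrip(text).equals(text)); // true
    }
}
```

Because the charset is named on both calls, the result does not depend on the platform default.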
Upvotes: 1
Reputation: 140234
This is not "UTF-8" but completely broken and unrepairable data. Strings do not have encodings, so it makes no sense to talk about a "UTF-8 string" in this context. A String is a sequence of abstract characters. It has no encoding except as an internal implementation detail, which is not our concern and is not related to your problem.
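One way such data can become unrepairable, sketched under the assumption that the bytes were decoded with the wrong charset at some point: invalid UTF-8 sequences are replaced with U+FFFD during decoding, and the original bytes cannot be recovered from that. The class and method names (`LossyDecoding`, `decodeAsUtf8`) are made up for this example, and it assumes the runtime supports ISO-8859-15. It is not a diagnosis of the exact data in the question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
class LossyDecoding {

    // Decode bytes as UTF-8, whatever charset actually produced them.
    static String decodeAsUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Assumption: ISO-8859-15 is available on this runtime.
        Charset iso = Charset.forName("ISO-8859-15");
        byte[] original = "\u00DC\u00FC\u00F6\u00E4ABC".getBytes(iso);

        // The umlaut bytes are not valid UTF-8, so the decoder substitutes U+FFFD.
        String garbled = decodeAsUtf8(original);
        System.out.println(garbled.contains("\uFFFD")); // true

        // Re-encoding the garbled string does not give the original bytes back.
        byte[] reencoded = garbled.getBytes(iso);
        System.out.println(Arrays.equals(original, reencoded)); // false
    }
}
```

Once the replacement characters are in the string, every invalid byte looks the same, so there is nothing left to convert back.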
Upvotes: 1
Reputation: 4519
You are encoding a string to ISO-8859-15 with byte[] b = "Üü?öäABC".getBytes("ISO-8859-15"); and then decoding it as UTF-8 with System.out.println(new String(b, "UTF-8"));. You have to decode it the same way, with ISO-8859-15.
Upvotes: 1