Reputation: 1822
For an iso8859-1 encoded String s, what is the most elegant way to convert it to utf8?
String convertedString = new String(s.getBytes("UTF-8"), "UTF-8"); //is this correct, elegant etc?
NOTE I know that there are already questions similar to this one, but the ones I've found have ambiguous answers and do not show the whole conversion.
EDIT: more detailed description of my problem
//message is a String
//msg.setContent is this method http://docs.oracle.com/javaee/6/api/javax/mail/internet/MimeMessage.html#setContent%28java.lang.Object,%20java.lang.String%29
msg.setContent(message, "text/plain");
msg.addHeader("Content-Type", "text/plain; charset=\"utf-8\"");
When this is received in a mail client, the header says utf8 but the content (i.e. the message String) is actually iso8859-1 encoded, which leads to characters such as åäö being incorrectly rendered. What I'd like to know is how to make the contents utf8 encoded.
EDIT II: (answer) Turns out it was the MimeMessage.java class that set the encoding to iso8859-1 and instead of using MimeMessage.setContent there is another method MimeMessage.setText(String text, String charset); which allowed me to set encoding to utf8.
Upvotes: 0
Views: 1104
Reputation: 136162
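No, it's not correct. A String is always UTF-16 internally, so the line in the question just encodes s to UTF-8 bytes and decodes them straight back, giving you the same String. You can only encode to, or decode from, a byte array.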
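To illustrate, here is a minimal sketch of what a conversion between the two encodings looks like when it is done on byte arrays. The class name CharsetDemo and the helper latin1ToUtf8 are made up for this example; only the StandardCharsets constants and the String constructor / getBytes overloads come from the JDK.

```java
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    // Hypothetical helper: re-encodes ISO-8859-1 bytes as UTF-8 bytes.
    static byte[] latin1ToUtf8(byte[] latin1Bytes) {
        // Decode: interpret the raw bytes as ISO-8859-1 to get a String.
        String text = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        // Encode: turn the String into UTF-8 bytes.
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The three bytes below are "åäö" in ISO-8859-1.
        byte[] latin1 = {(byte) 0xE5, (byte) 0xE4, (byte) 0xF6};
        byte[] utf8 = latin1ToUtf8(latin1);
        // Each of these characters needs two bytes in UTF-8.
        System.out.println(latin1.length + " bytes in, " + utf8.length + " bytes out");
    }
}
```

Note that the String in the middle has no encoding of its own; the charset only appears at the two points where bytes are involved.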
Upvotes: 0
Reputation: 121840
You don't convert a string from one encoding to another. A String is a series of chars, and that's it. For what it's worth, it could be a series of carrier pigeons. Pigeons don't have an encoding. Neither do chars.
What you do is convert it to bytes when using a Writer (or read from bytes when using a Reader). It is at this point that the encoding (a Charset) matters.
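As a rough sketch of that point, the example below writes the same String through an OutputStreamWriter twice, once per Charset, and compares the resulting byte counts. The class name WriterEncodingDemo and the helper writeWith are hypothetical names for this illustration; the in-memory ByteArrayOutputStream stands in for whatever file or socket you would really write to.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class WriterEncodingDemo {
    // Hypothetical helper: writes text through a Writer using the given
    // charset and returns the bytes that ended up in the underlying stream.
    static byte[] writeWith(String text, Charset charset) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(sink, charset)) {
            writer.write(text);
        } catch (IOException e) {
            // Not expected for an in-memory stream; rethrown unchecked.
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }

    public static void main(String[] args) {
        String text = "\u00e5\u00e4\u00f6"; // "åäö", the same String both times
        int latin1Length = writeWith(text, StandardCharsets.ISO_8859_1).length;
        int utf8Length = writeWith(text, StandardCharsets.UTF_8).length;
        System.out.println("ISO-8859-1: " + latin1Length + " bytes, UTF-8: " + utf8Length + " bytes");
    }
}
```

The String is identical in both calls; only the Charset handed to the Writer decides which bytes come out.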
Upvotes: 3