fred

Reputation: 1822

Convert iso8859-1 to utf8 in java

For an iso8859-1 encoded String s, what is the most elegant way to convert it to utf8?

String convertedString = new String(s.getBytes("UTF-8"), "UTF-8"); //is this correct, elegant etc?

NOTE: I know that there are already questions similar to this one, but the ones I've found have ambiguous answers and do not show the whole conversion.

EDIT: more detailed description of my problem

//message is a String
//msg.setContent is this method http://docs.oracle.com/javaee/6/api/javax/mail/internet/MimeMessage.html#setContent%28java.lang.Object,%20java.lang.String%29

msg.setContent(message, "text/plain"); 
msg.addHeader("Content-Type", "text/plain; charset=\"utf-8\"");

When this is received in a mail client, the header says utf8 but the content (i.e. the message String) is actually iso8859-1 encoded, which leads to characters such as åäö being incorrectly rendered. What I'd like to know is how to make the contents utf8 encoded.

EDIT II (answer): It turns out that the MimeMessage class was setting the encoding to iso8859-1. Instead of MimeMessage.setContent, there is another method, MimeMessage.setText(String text, String charset), which allowed me to set the encoding to utf8.

Upvotes: 0

Views: 1104

Answers (2)

Evgeniy Dorofeev

Reputation: 136162

No, it's not correct. A String is always a sequence of UTF-16 code units. You can only encode a String into a byte array, or decode a byte array into a String.
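As a rough sketch of what that looks like in practice, assuming the ISO-8859-1 data arrives as a byte array: decode the bytes into a String, then encode that String as UTF-8. The class and method names here (CharsetConversion, latin1ToUtf8) are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

class CharsetConversion {
    // Hypothetical helper: re-encodes bytes holding ISO-8859-1 text as UTF-8 bytes.
    static byte[] latin1ToUtf8(byte[] latin1Bytes) {
        // Decode: bytes -> String, interpreting the bytes as ISO-8859-1.
        String text = new String(latin1Bytes, StandardCharsets.ISO_8859_1);
        // Encode: String -> bytes, this time as UTF-8.
        return text.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The characters å, ä, ö as single ISO-8859-1 bytes.
        byte[] latin1 = {(byte) 0xE5, (byte) 0xE4, (byte) 0xF6};
        byte[] utf8 = latin1ToUtf8(latin1);
        // Each of these characters needs two bytes in UTF-8.
        System.out.println(latin1.length + " bytes -> " + utf8.length + " bytes");
    }
}
```

The String in the middle has no encoding of its own; only the two byte arrays do.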

Upvotes: 0

fge

Reputation: 121840

You don't convert a string from one encoding to another. A String is a series of chars, and that's it. For what it's worth, it could be a series of carrier pigeons. Pigeons don't have an encoding. Neither do chars.

What you do is convert it to bytes when using a Writer (or read it from bytes when using a Reader). It is at this point that the encoding (a Charset) matters.
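A minimal sketch of that idea, using in-memory streams so it runs on its own. The class and method names (WriterEncodingDemo, encode, decode) are invented for this example; the point is only that the Charset is supplied where the Writer or Reader meets the bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class WriterEncodingDemo {
    // Hypothetical helper: the Writer turns chars into bytes using the given Charset.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    // Hypothetical helper: the Reader turns bytes back into chars using the given Charset.
    static String decode(byte[] bytes, Charset charset) {
        StringBuilder result = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(bytes), charset)) {
            char[] buffer = new char[256];
            int read;
            while ((read = reader.read(buffer)) != -1) {
                result.append(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result.toString();
    }

    public static void main(String[] args) {
        String text = "\u00E5\u00E4\u00F6"; // the characters å, ä, ö
        // Same String, different byte counts depending on the Charset chosen.
        System.out.println(encode(text, StandardCharsets.ISO_8859_1).length);
        System.out.println(encode(text, StandardCharsets.UTF_8).length);
        // Decoding with the matching Charset gives back the original String.
        System.out.println(text.equals(decode(encode(text, StandardCharsets.UTF_8), StandardCharsets.UTF_8)));
    }
}
```

Decoding with a different Charset than the one used for encoding is what produces the garbled åäö described in the question.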

Upvotes: 3
