Hurve

Reputation: 173

Conversion between character encodings in Java

I cannot work out how to do the conversion below:

    String s = "HÃ¤r har du!  â\u0080\u0093 Hur vÃ¤l kan du snacka?";
    String t = convert(s);
    // t should be "Här har du! – Hur väl kan du snacka?"

I cannot find how to translate s into t. Does anybody know how to do this in Java?

Upvotes: 2

Views: 1762

Answers (2)

Semih Eker

Reputation: 2409

Try something like this:

     String s = "HÃ¤r har du!  â\u0080\u0093 Hur vÃ¤l kan du snacka?";
     byte[] bytes = s.getBytes("ISO-8859-1");
     String str  = new String(bytes, "UTF-8");

The output is:

    Här har du!  – Hur väl kan du snacka?

for the following program:

    public class Main {
        public static void main(String[] args) throws java.lang.Exception {
            String s = "HÃ¤r har du!  â\u0080\u0093 Hur vÃ¤l kan du snacka?";
            // Recover the raw bytes (one byte per char), then decode them as UTF-8.
            byte[] bytes = s.getBytes("ISO-8859-1");
            String str = new String(bytes, "UTF-8");
            System.out.println(str);
        }
    }
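
The same idea can be wrapped in the `convert` method the question asks for. This is only a minimal sketch: the class name `MojibakeFix` is made up for illustration, and it assumes every char of the input stands for exactly one byte of the original UTF-8 text. The input is written with `\u` escapes so the example does not depend on the source file's encoding, and `StandardCharsets` is used so no checked `UnsupportedEncodingException` has to be handled.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeFix {
    // Hypothetical helper: turn each char back into one byte (ISO-8859-1),
    // then decode the resulting byte sequence as UTF-8.
    static String convert(String s) {
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "HÃ¤r har du!  â\u0080\u0093 Hur vÃ¤l kan du snacka?" written with escapes
        String s = "H\u00C3\u00A4r har du!  \u00E2\u0080\u0093 Hur v\u00C3\u00A4l kan du snacka?";
        System.out.println(convert(s));
    }
}
```

What the console shows depends on the terminal's own encoding, so comparing against an expected string in code is more reliable than reading the printed text.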

Upvotes: 3

jtahlborn

Reputation: 53664

As I already mentioned in my comment, it looks like your String s is already corrupted. The correct solution is to fix wherever you got s from in the first place. It seems that text which is really UTF-8 encoded is being decoded with some single-byte encoding (ISO-8859-1 seems to fit your test string).
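
To illustrate the kind of corruption described here, the sketch below decodes UTF-8 bytes with ISO-8859-1 on purpose. The class and method names are hypothetical, and this is only a guess at what happens upstream, not a reconstruction of the actual code that produced s.

```java
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    // Encodes the text as UTF-8, then decodes those bytes with the wrong
    // (single-byte) charset, so every byte becomes its own char.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }
}
```

With this, `misdecode("\u00E4")` (ä) yields the two chars `\u00C3\u00A4` (Ã¤), and `misdecode("\u2013")` (an en dash) yields `\u00E2\u0080\u0093`, which matches the sequence seen in the test string.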

Provided you haven't already lost data in the original string corruption, you can somewhat patch your current string using:

    String s = "HÃ¤r har du!  â\u0080\u0093 Hur vÃ¤l kan du snacka?";
    byte[] b = s.getBytes("ISO-8859-1");
    String t = new String(b, "UTF-8");
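
Whether data has already been lost can be checked rather than assumed. The following is a rough sketch with a hypothetical `tryRepair` helper: it refuses to patch a string that contains chars outside ISO-8859-1, and it uses a strict decoder so that malformed UTF-8 is reported instead of being silently replaced with U+FFFD.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Optional;

final class StrictRepair {
    // Returns the repaired text, or empty if s cannot be a clean
    // ISO-8859-1 mis-decoding of UTF-8 bytes.
    static Optional<String> tryRepair(String s) {
        // A char above U+00FF cannot stand for a single original byte.
        if (!StandardCharsets.ISO_8859_1.newEncoder().canEncode(s)) {
            return Optional.empty();
        }
        byte[] bytes = s.getBytes(StandardCharsets.ISO_8859_1);
        try {
            // REPORT makes the decoder throw on invalid byte sequences.
            return Optional.of(StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes))
                    .toString());
        } catch (CharacterCodingException e) {
            return Optional.empty();
        }
    }
}
```

Note that a string which is already correct, such as one containing a real "ä" (`\u00E4`), comes back empty here, because a lone 0xE4 byte followed by an ASCII letter is not valid UTF-8. That makes the check a reasonable guard against patching text twice.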

Upvotes: 1
