user2517294

Reputation: 11

How to convert a UTF-8 string to GBK in Java

I retrieved an HTML string from a target site, and within it there is a section:

class="f9t" name="Óû§Ãû:ôâÈ»12" 

I know it's in GBK encoding, as I can see from how Firefox displays it. But I do not know how to convert that name string into a readable GBK string (such as 上海 or 北京).

I am using

String sname = new String(name.getBytes(), "UTF-8");
byte[] gbkbytes = sname.getBytes("gb2312");
String gbkStr = new String( gbkbytes );
System.out.println(gbkStr);

but the output is not readable GBK text:

???¡ì??:????12

I have no clue how to proceed.

Upvotes: 1

Views: 12975

Answers (2)

rml

Reputation: 833

You can try this if you have already read the name with the wrong encoding and got the garbled value "Óû§Ãû:ôâÈ»12", as @Karol S suggested:

new String(name.getBytes("ISO-8859-1"), "GBK")
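To illustrate why this works, here is a minimal sketch. It assumes the GBK bytes were originally mis-decoded as ISO-8859-1, which maps every byte to exactly one character, so the original bytes can be recovered without loss. The class name `MojibakeRepair` and the method `repair` are made up for this example, and it assumes the JDK in use ships the GBK charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeRepair {
    // Hypothetical helper: undo a wrong ISO-8859-1 decode of GBK bytes.
    static String repair(String garbled) {
        // Step 1: turn each character back into the single byte it came from.
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        // Step 2: decode those bytes with the charset they were really in.
        return new String(raw, Charset.forName("GBK"));
    }

    public static void main(String[] args) {
        String original = "\u4e0a\u6d77"; // 上海, written as escapes to avoid source-encoding issues
        // Simulate the mistake: GBK bytes decoded as ISO-8859-1.
        byte[] gbkBytes = original.getBytes(Charset.forName("GBK"));
        String garbled = new String(gbkBytes, StandardCharsets.ISO_8859_1);
        // The round trip restores the original text.
        System.out.println(repair(garbled).equals(original));
    }
}
```

If the string was garbled by a different charset (for example windows-1252, which leaves some byte values unmapped), this round trip may lose bytes, so it is safer to fix the decoding at the point where the data is read.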

Or, if you are reading a GBK or GB2312 string from the Internet or a file, use something like this to get the right string in the first place:

BufferedReader r = new BufferedReader(new InputStreamReader(is, "GBK"));
name = r.readLine();
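A self-contained sketch of that second approach follows. The `ByteArrayInputStream` only stands in for whatever stream `is` really is (an HTTP response body, a file, and so on), and the names `GbkReaderDemo` and `readFirstLine` are invented for illustration.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;

public class GbkReaderDemo {
    // Hypothetical helper: read the first line of a GBK-encoded stream.
    static String readFirstLine(InputStream is) {
        // Tell the reader the charset up front so the bytes are decoded correctly once.
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(is, Charset.forName("GBK")))) {
            return r.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String expected = "\u5317\u4eac"; // 北京
        // Stand-in for real input: the GBK bytes of the expected text.
        InputStream is = new ByteArrayInputStream(expected.getBytes(Charset.forName("GBK")));
        String name = readFirstLine(is);
        System.out.println(name.equals(expected));
    }
}
```

Once the text is decoded correctly, it is an ordinary Java `String`; the charset only matters again when it is written out as bytes.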

Upvotes: 5

Nikolay

Reputation: 81

Assuming that name.getBytes() returns GBK-encoded bytes, it's enough to create a string while specifying the encoding of the byte array:

new String(gbkString.getBytes(), "GBK");

According to the documentation, the name of the encoding should be GBK.

Sample code:

String gbkString = "Óû§Ãû:ôâÈ»12";
String utfString = new String(gbkString.getBytes(), "GBK");
System.out.println(utfString);

Result (not 100% sure that it's correct :) ): 脫脙禄搂脙没:么芒脠禄12
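One possible reason the result still looks wrong: getBytes() with no argument uses the platform default charset. Assuming that default is UTF-8, each garbled character such as Ó is re-encoded as two bytes rather than the single original byte, so decoding as GBK produces different garbage. The sketch below only demonstrates that byte-count difference; the class name `DefaultCharsetPitfall` and the method `byteCount` are made up for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DefaultCharsetPitfall {
    // Hypothetical helper: how many bytes does this string occupy in the given charset?
    static int byteCount(String s, Charset cs) {
        return s.getBytes(cs).length;
    }

    public static void main(String[] args) {
        String garbledChar = "\u00d3"; // 'Ó', the first character of the garbled name
        // ISO-8859-1 gives back the single original byte 0xD3.
        System.out.println(byteCount(garbledChar, StandardCharsets.ISO_8859_1)); // prints 1
        // UTF-8 encodes the same character as two bytes, so the GBK decode sees different input.
        System.out.println(byteCount(garbledChar, StandardCharsets.UTF_8)); // prints 2
    }
}
```

Passing an explicit charset to getBytes avoids depending on the platform default.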

Upvotes: 0
