Reputation: 11
I retrieved HTML string from an objective site and within it there is a section
class="f9t" name="Óû§Ãû:ôâÈ»12"
I know it's in GBK encoding, as I can see it from the FF browser display. But I do not know how to convert that name string into a readable GBK string (such as 上海 or 北京).
I am using
String sname = new String(name.getBytes(), "UTF-8");
byte[] gbkbytes = sname.getBytes("gb2312");
String gbkStr = new String( gbkbytes );
System.out.println(gbkStr);
but it's not printed right in GBK text
???¡ì??:????12
I have no clue how to proceed.
Upvotes: 1
Views: 12975
Reputation: 833
You can try this if you already read the name with a wrong encoding and get the wrong name value "Óû§Ãû:ôâÈ»12", as @Karol S suggested:
new String(name.getBytes("ISO-8859-1"), "GBK")
Or if you read a GBK or GB2312 string from internet or a file, use something like this to get the right string at the first place:
BufferedReader r = new BufferedReader(new InputStreamReader(is,"GBK")); name = r.readLine();
Upvotes: 5
Reputation: 81
Assuming that name.getBytes()
returns GBK encoded string it's enough to create string specifying encoding of array of bytes:
new String(gbkString.getBytes(), "GBK");
Regarding to documentation the name of encryption should be GBK
.
Sample code:
String gbkString = "Óû§Ãû:ôâÈ»12";
String utfString = new String(gbkString.getBytes(), "GBK");
System.out.println(utfString);
Result (not 100% sure that it's correct :) ): 脫脙禄搂脙没:么芒脠禄12
Upvotes: 0