Reputation: 31
I am trying to POST form data encoded in Big5, and I get a response like this:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html lang="zh-TW">
The Java code looks like this:
Document docs = Jsoup.connect(param)
        .timeout(30000)
        .postDataCharset("Big5")
        .data("syear", "104")
        .data("smonth", "6")
        .data("sday", "30")
        .data("eyear", "104")
        .data("emonth", "7")
        .data("eday", "17")
        .data("SectNO", "不限科別")
        .data("EmpNO", "不限醫生")
        .post();
How do I set the charset for the data I send so that I get the correct response?
Upvotes: 1
Views: 953
Reputation: 43053
As of Jsoup 1.8.3, postDataCharset() only sets the charset of the data being posted. That charset isn't reused when Jsoup parses the response it reads back.
Instead, Jsoup looks in the response for a charset declaration, such as a meta http-equiv tag. If it can't find one, it assumes by default that the charset is UTF-8. In your case, that assumption is wrong.
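To see why a UTF-8 fallback breaks Big5 content, here is a minimal sketch that uses only the JDK, not Jsoup. The class name Big5DecodeDemo and its helper methods are made up for illustration, and it assumes your Java runtime ships the Big5 charset (standard JDK builds normally do).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class Big5DecodeDemo {
    // Assumption: the runtime provides the "Big5" charset.
    static final Charset BIG5 = Charset.forName("Big5");

    // Sample text taken from the form data in the question.
    static final String SAMPLE = "不限科別";

    // Encode text the way a Big5 server would put it on the wire.
    static byte[] encodeBig5(String text) {
        return text.getBytes(BIG5);
    }

    // Decode raw bytes with an explicitly chosen charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        byte[] wire = encodeBig5(SAMPLE);

        // Decoding with the charset the bytes were written in recovers the text.
        System.out.println("Big5 matches:  " + decode(wire, BIG5).equals(SAMPLE));

        // Decoding the same bytes as UTF-8 does not recover the text.
        System.out.println("UTF-8 matches: " + decode(wire, StandardCharsets.UTF_8).equals(SAMPLE));
    }
}
```

The same bytes give readable text or garbage depending only on which charset the decoder is told to use, which is what happens when the response body is parsed with the wrong default.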
To work around this, don't let Jsoup guess the encoding for you. Here is how to do it:
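import org.jsoup.Connection.Response;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

// Let Jsoup fetch the data, but don't let it parse the body yet
Response res = Jsoup.connect(param) //
        .timeout(30000) //
        .postDataCharset("Big5") //
        .data("syear", "104") //
        .data("smonth", "6") //
        .data("sday", "30") //
        .data("eyear", "104") //
        .data("emonth", "7") //
        .data("eday", "17") //
        .data("SectNO", "不限科別") //
        .data("EmpNO", "不限醫生") //
        .execute();
// Now, we tell it explicitly which encoding to use:
// decode the raw bytes as Big5, then parse the resulting string
Document docs = Jsoup.parse(
        new String(res.bodyAsBytes(), "Big5"), //
        param // base URI used to resolve relative links
);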
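If you would rather not hard-code Big5 in every call, one option is to use the charset from the Content-Type header when the server sends one, and fall back to Big5 otherwise. The sketch below is only an illustration of that idea using the JDK alone. CharsetPicker and pickCharset are hypothetical names, not part of Jsoup, and the parsing is deliberately simple.

```java
import java.nio.charset.Charset;
import java.util.Locale;

public class CharsetPicker {
    // Hypothetical helper: returns the charset named in a Content-Type value
    // such as "text/html; charset=Big5", or the fallback if none is usable.
    static Charset pickCharset(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback;
        }
        for (String part : contentType.split(";")) {
            String p = part.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = p.substring("charset=".length()).trim().replace("\"", "");
                try {
                    return Charset.forName(name);
                } catch (IllegalArgumentException e) {
                    // Illegal or unsupported charset name: use the fallback.
                    return fallback;
                }
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        // Assumption: the runtime provides the "Big5" charset.
        Charset big5 = Charset.forName("Big5");
        System.out.println(pickCharset("text/html", big5));
        System.out.println(pickCharset("text/html; charset=UTF-8", big5));
    }
}
```

With the code above, you could then decode with something like new String(res.bodyAsBytes(), CharsetPicker.pickCharset(res.contentType(), Charset.forName("Big5"))) before handing the string to Jsoup.parse. This only helps if the server's header is trustworthy; if it declares the wrong charset, stick with the explicit "Big5".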
Upvotes: 0