Reputation: 3295
I am using HttpClient 4.1 to download a web page. I would like to get a compressed version:
HttpGet request = new HttpGet(url);
request.addHeader("Accept-Encoding", "gzip,deflate");
HttpResponse response = httpClient.execute(request,localContext);
HttpEntity entity = response.getEntity();
response.getFirstHeader("Content-Encoding") shows "Content-Encoding: gzip"; however, entity.getContentEncoding() is null.
If I put:
entity = new GzipDecompressingEntity(entity);
I get:
java.io.IOException: Not in GZIP format
It looks like the resulting page is plain text and not compressed, even though the "Content-Encoding" header says it is gzipped.
I have tried this on several URLs (from different websites) but get the same results.
How can I get a compressed version of a web page?
Upvotes: 1
Views: 241
Reputation: 382150
Don't use HttpClient if you don't want your API to handle mundane things like unzipping.
You can use the basic URLConnection class to fetch the compressed stream, as demonstrated by the following code:
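import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

// The class name is arbitrary; it only wraps main so the snippet compiles.
public class FetchCompressed {
    public static void main(String[] args) {
        try {
            URL url = new URL("http://code.jquery.com/jquery-latest.js");
            URLConnection con = url.openConnection();
            // Ask the server for a compressed body; comment out the next line
            // if you want to have something readable in your console.
            con.addRequestProperty("Accept-Encoding", "gzip,deflate");
            // try-with-resources closes the reader and the underlying stream.
            try (BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()))) {
                String l;
                while ((l = in.readLine()) != null) {
                    System.out.println(l);
                }
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}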
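If you later want the text back, you have to do the unzipping yourself. The following is only a minimal sketch of one way to do that with the JDK: it wraps the stream in a GZIPInputStream when the Content-Encoding value is "gzip" and reads it as-is otherwise. The class name GzipAwareFetch and the helpers decode and fetch are hypothetical names made up for this illustration, and the sketch assumes the page is UTF-8; it also requests only gzip, since deflate is not handled here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;

// Hypothetical helper class for illustration; not part of any library.
final class GzipAwareFetch {

    // Decodes a response body, unzipping it only when the Content-Encoding
    // value says the bytes are gzip-compressed. A null encoding means "plain".
    static String decode(InputStream raw, String contentEncoding) throws IOException {
        InputStream in = "gzip".equalsIgnoreCase(contentEncoding)
                ? new GZIPInputStream(raw)
                : raw;
        try (InputStream body = in) {
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] chunk = new byte[8192];
            int n;
            while ((n = body.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
            // Assumption: the page is UTF-8. A fuller version would take the
            // charset from the Content-Type header instead.
            return new String(out.toByteArray(), StandardCharsets.UTF_8);
        }
    }

    // Requests a gzip body and decodes whatever the server actually sends.
    static String fetch(String address) throws IOException {
        URLConnection con = new URL(address).openConnection();
        con.addRequestProperty("Accept-Encoding", "gzip");
        return decode(con.getInputStream(), con.getContentEncoding());
    }
}
```

Calling GzipAwareFetch.fetch("http://code.jquery.com/jquery-latest.js") should then return readable text whether or not the server chose to compress the response, because the decision is driven by the header the server sends back rather than by what was requested.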
Upvotes: 1