Reputation: 22769
I know that the okhttp3 library adds the header Accept-Encoding: gzip by default and decodes the response automatically for us.
The problem is that I'm dealing with a host that only accepts a header like Accept-Encoding: gzip, deflate; if I don't add the deflate part, the request fails. Now, when I manually add that header to the okhttp client, the library no longer does the decompression for me.
I've tried multiple solutions that take the response and decompress it manually, but I always end up with an exception, i.e. java.util.zip.ZipException: Not in GZIP format. Here's what I've tried so far:
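//decompresser
public static String decompressGZIP(InputStream inputStream) throws IOException
{
    // try-with-resources closes the stream even if reading fails
    try (InputStream bodyStream = new GZIPInputStream(inputStream))
    {
        ByteArrayOutputStream outStream = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int length;
        // read() returns -1 at end of stream
        while ((length = bodyStream.read(buffer)) != -1)
        {
            outStream.write(buffer, 0, length);
        }
        // Decode explicitly as UTF-8 (assumed) rather than the platform default charset
        return new String(outStream.toByteArray(), StandardCharsets.UTF_8);
    }
}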
//run scraper
scrape(api, new Callback()
{
    // Something went wrong
    @Override
    public void onFailure(@NonNull Call call, @NonNull IOException e)
    {
    }

    @Override
    public void onResponse(@NonNull Call call, @NonNull Response response) throws IOException
    {
        try
        {
            if (response.isSuccessful())
            {
                InputStream responseBodyBytes = response.body().byteStream();
                Object returnedObject = GZIPCompression.decompress(responseBodyBytes);
                if (returnedObject != null)
                {
                    String htmlResponse = returnedObject.toString();
                }
            }
        }
        catch (ProtocolException e){}
        finally
        {
            // close the response on every path, not only on success
            response.close();
        }
    }
});
private Call scrape(Map<?, ?> api, Callback callback)
{
    MediaType JSON = MediaType.parse("application/json; charset=utf-8");
    String method = (String) api.get("method");
    String url = (String) api.get("url");
    Request.Builder requestBuilder = new Request.Builder().url(url);
    RequestBody requestBody;
    requestBuilder.header("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:61.0) Gecko/20100101 Firefox/61.0");
    requestBuilder.header("Accept", "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8");
    requestBuilder.header("Accept-Language", "en-US,en;q=0.5");
    requestBuilder.header("Accept-Encoding", "gzip, deflate");
    requestBuilder.header("Connection", "keep-alive");
    requestBuilder.header("Upgrade-Insecure-Requests", "1");
    requestBuilder.header("Cache-Control", "max-age=0");
    Request request = requestBuilder.build();
    Call call = client.newCall(request);
    call.enqueue(callback);
    return call;
}
Just a note: the response headers always include Content-Encoding: gzip and Transfer-Encoding: chunked.
One more thing: I've also tried the solution in this topic, and it still fails with D/OkHttp: java.io.IOException: ID1ID2: actual 0x00003c68 != expected 0x00001f8b.
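A note on reading that last error: 0x1f8b is the two-byte gzip signature, while 0x3c68 is the ASCII pair "<h", which suggests the body was plain text (for example HTML) rather than gzip data, whatever the Content-Encoding header said. The following is only a sketch of one way to guard against that using the standard library: peek at the first two bytes and wrap the stream in GZIPInputStream only when the signature matches. The class and method names (GzipSniffer, maybeGunzip, readAll) are hypothetical and not part of okhttp.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class GzipSniffer {

    // Hypothetical helper: returns a stream of decoded bytes whether or not
    // the input is gzipped, by peeking at the two-byte gzip signature.
    static InputStream maybeGunzip(InputStream in) throws IOException {
        PushbackInputStream pushback = new PushbackInputStream(in, 2);
        byte[] signature = new byte[2];
        int read = 0;
        while (read < 2) {
            int n = pushback.read(signature, read, 2 - read);
            if (n < 0) {
                break; // stream shorter than two bytes
            }
            read += n;
        }
        if (read > 0) {
            pushback.unread(signature, 0, read); // put the peeked bytes back
        }
        boolean gzipped = read == 2
                && (signature[0] & 0xff) == 0x1f
                && (signature[1] & 0xff) == 0x8b;
        return gzipped ? new GZIPInputStream(pushback) : pushback;
    }

    // Reads the whole stream as UTF-8 text (the charset is an assumption).
    static String readAll(InputStream in) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        int length;
        while ((length = in.read(buffer)) != -1) {
            out.write(buffer, 0, length);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        byte[] plain = "<html>hi</html>".getBytes(StandardCharsets.UTF_8);

        ByteArrayOutputStream compressed = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(compressed)) {
            gzip.write(plain);
        }

        // Both calls yield the same text, compressed input or not.
        System.out.println(readAll(maybeGunzip(new ByteArrayInputStream(compressed.toByteArray()))));
        System.out.println(readAll(maybeGunzip(new ByteArrayInputStream(plain))));
    }
}
```

With an okhttp response, the input would be response.body().byteStream(); the sketch deliberately avoids okhttp types so it can be run on its own.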
Any help would be appreciated.
Upvotes: 10
Views: 16930
Reputation: 41
I had to implement this myself recently and found that existing answers had a few errors, so here's my take with how it works today.
import java.util.ArrayList;
import java.util.Collections;
import java.util.zip.Inflater;
import okhttp3.Interceptor;
import okhttp3.OkHttpClient;
import okhttp3.ResponseBody;
import okio.BufferedSource;
import okio.GzipSource;
import okio.InflaterSource;
import okio.Okio;

var client = new OkHttpClient.Builder()
    .addInterceptor(
        (Interceptor.Chain chain) -> {
          var oldRequest = chain.request();

          // If the caller has passed their own Accept-Encoding,
          // they are indicating they expect to handle it themselves.
          if (oldRequest.header("Accept-Encoding") != null) {
            return chain.proceed(oldRequest);
          }

          // Augment request saying we accept multiple content encodings
          var newHeaders =
              oldRequest
                  .headers()
                  .newBuilder()
                  .add("Accept-Encoding", "deflate")
                  .add("Accept-Encoding", "gzip")
                  .build();
          var newRequest = oldRequest.newBuilder().headers(newHeaders).build();
          var oldResponse = chain.proceed(newRequest);

          // Replace the response's request with the original one
          var responseBuilder = oldResponse.newBuilder().request(oldRequest);

          // We might not have a body to decompress
          var body = oldResponse.body();
          if (body != null) {
            BufferedSource source = body.source();

            // The body may have been wrapped in an arbitrary encoding sequence
            // and the server returns them in the order it encoded them
            // so we wrap them with decoders in reverse order.
            // Headers.values() returns an unmodifiable list, so copy it before reversing.
            var encodings = new ArrayList<>(oldResponse.headers().values("Content-Encoding"));
            Collections.reverse(encodings);
            for (var encoding : encodings) {
              if ("deflate".equalsIgnoreCase(encoding)) {
                // nowrap=true expects raw DEFLATE; zlib-wrapped data needs new Inflater()
                var inflater = new Inflater(true);
                source = Okio.buffer(new InflaterSource(source, inflater));
              } else if ("gzip".equalsIgnoreCase(encoding)) {
                source = Okio.buffer(new GzipSource(source));
              }
            }

            // Strip encoding and length headers as we've already handled them
            var strippedHeaders =
                oldResponse
                    .headers()
                    .newBuilder()
                    .removeAll("Content-Encoding")
                    .removeAll("Content-Length")
                    .build();
            responseBuilder.headers(strippedHeaders);

            // contentType() is null-safe when the Content-Type header is missing
            var contentType = body.contentType();

            // Construct a new body with an unknown (-1) Content-Length
            var newBody = ResponseBody.create(contentType, -1L, source);
            responseBuilder.body(newBody);
          }

          return responseBuilder.build();
        })
    .build();
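One detail worth checking in the deflate branch: new Inflater(true) decodes raw DEFLATE data only, whereas the HTTP "deflate" coding is defined as zlib-wrapped data, and some servers are known to send the raw form anyway. The sketch below, using only java.util.zip, shows one possible way to accept both by trying the zlib format first and falling back to raw DEFLATE. It works on a fully buffered body, and the names DeflateFallback, inflate, inflateEitherFormat and deflate are hypothetical helpers rather than okhttp or okio API.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class DeflateFallback {

    // Inflates a complete in-memory buffer. nowrap=false expects a zlib
    // wrapper (RFC 1950); nowrap=true expects a raw DEFLATE stream (RFC 1951).
    static byte[] inflate(byte[] data, boolean nowrap) throws DataFormatException {
        Inflater inflater = new Inflater(nowrap);
        try {
            inflater.setInput(data);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    // No progress is possible: avoid looping forever on bad input.
                    throw new DataFormatException("truncated or unsupported stream");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } finally {
            inflater.end(); // release native zlib memory
        }
    }

    // Tries the zlib-wrapped format first, then falls back to raw DEFLATE.
    static byte[] inflateEitherFormat(byte[] data) throws DataFormatException {
        try {
            return inflate(data, false);
        } catch (DataFormatException zlibFailure) {
            return inflate(data, true);
        }
    }

    // Demo-only helper that produces either format.
    static byte[] deflate(byte[] data, boolean nowrap) {
        Deflater deflater = new Deflater(Deflater.DEFAULT_COMPRESSION, nowrap);
        try {
            deflater.setInput(data);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!deflater.finished()) {
                out.write(buffer, 0, deflater.deflate(buffer));
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    public static void main(String[] args) throws DataFormatException {
        byte[] text = "<html>hello</html>".getBytes(StandardCharsets.UTF_8);
        for (boolean nowrap : new boolean[] {false, true}) {
            byte[] decoded = inflateEitherFormat(deflate(text, nowrap));
            System.out.println(new String(decoded, StandardCharsets.UTF_8));
        }
    }
}
```

Buffering the whole body keeps the sketch short; a streaming variant inside the interceptor would need to peek at the first bytes before choosing the Inflater.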
Upvotes: 0
Reputation: 168
Thank you for Aksenov Vladimir's reply; it saved me a lot of time. Everything is working fine after I upgraded okhttp from 3.x to 4.11.
Here are some additional details. The relevant code, from okhttp3.internal.http.BridgeInterceptor, is as follows:
// If we add an "Accept-Encoding: gzip" header field we're responsible for also decompressing
// the transfer stream.
var transparentGzip = false
if (userRequest.header("Accept-Encoding") == null && userRequest.header("Range") == null) {
  transparentGzip = true
  requestBuilder.header("Accept-Encoding", "gzip")
}

if (transparentGzip &&
    "gzip".equals(networkResponse.header("Content-Encoding"), ignoreCase = true) &&
    networkResponse.promisesBody()) {
  val responseBody = networkResponse.body
  if (responseBody != null) {
    val gzipSource = GzipSource(responseBody.source())
    val strippedHeaders = networkResponse.headers.newBuilder()
        .removeAll("Content-Encoding")
        .removeAll("Content-Length")
        .build()
    responseBuilder.headers(strippedHeaders)
    val contentType = networkResponse.header("Content-Type")
    responseBuilder.body(RealResponseBody(contentType, -1L, gzipSource.buffer()))
  }
}
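Read literally, the condition quoted above means the library only decompresses when it added the Accept-Encoding header itself, which would explain why setting the header manually turns decompression off. The snippet below is a rough Java model of that condition, not okhttp code: the class TransparentGzipModel and its methods are hypothetical, headers are simplified to a single-valued map, and the promisesBody() check is left out.

```java
import java.util.Map;
import java.util.TreeMap;

public class TransparentGzipModel {

    // Builds a header map with case-insensitive names from name/value pairs.
    static Map<String, String> headers(String... pairs) {
        Map<String, String> map = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
        for (int i = 0; i + 1 < pairs.length; i += 2) {
            map.put(pairs[i], pairs[i + 1]);
        }
        return map;
    }

    // Mirrors the first "if": gzip is requested on the caller's behalf only
    // when the caller set neither Accept-Encoding nor Range.
    static boolean addsGzipHeader(Map<String, String> requestHeaders) {
        return !requestHeaders.containsKey("Accept-Encoding")
                && !requestHeaders.containsKey("Range");
    }

    // Mirrors the second "if" (without the body check): the response is
    // decompressed only if the header was added here and the server used gzip.
    static boolean decompresses(Map<String, String> requestHeaders, String contentEncoding) {
        return addsGzipHeader(requestHeaders) && "gzip".equalsIgnoreCase(contentEncoding);
    }

    public static void main(String[] args) {
        // No Accept-Encoding set by the caller: transparent decompression applies.
        System.out.println(decompresses(headers(), "gzip")); // true

        // Caller sets the header manually: the gzip body is passed through as-is.
        System.out.println(decompresses(headers("Accept-Encoding", "gzip, deflate"), "gzip")); // false
    }
}
```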
Upvotes: 1
Reputation: 1
This is because okhttp does not support deflate. In BridgeInterceptor.java (or BridgeInterceptor.kt), only gzip is checked for:
if (transparentGzip &&
"gzip".equals(networkResponse.header("Content-Encoding"), ignoreCase = true) &&
networkResponse.promisesBody()) {
Upvotes: 0
Reputation: 707
Version 4.10.0 can already do it automatically if your header contains gzip.
Upvotes: 1
Reputation: 22769
After 6 hours of digging I found the correct solution, and as usual it was easier than I thought. I was trying to decompress a page that wasn't gzipped, which is why it was failing. Once I hit the second page (which is compressed), I get a gzipped response that the code above handles. If anyone wants the solution: I used a modified interceptor, just like the one in this answer, so you don't need a custom function to handle the decompression.
I modified the unzip method to make the okhttp interceptor work with both compressed and uncompressed responses:
OkHttpClient.Builder clientBuilder = new OkHttpClient.Builder().addInterceptor(new UnzippingInterceptor());
OkHttpClient client = clientBuilder.build();
And the interceptor looks like this:
private class UnzippingInterceptor implements Interceptor {
    @Override
    public Response intercept(Chain chain) throws IOException {
        Response response = chain.proceed(chain.request());
        return unzip(response);
    }

    // copied from okhttp3.internal.http.HttpEngine (because it is private)
    private Response unzip(final Response response) throws IOException {
        ResponseBody body = response.body();
        if (body == null)
        {
            return response;
        }

        // check if we have a gzip response
        String contentEncoding = response.header("Content-Encoding");

        // this is used to decompress gzipped responses
        if ("gzip".equalsIgnoreCase(contentEncoding))
        {
            GzipSource gzipSource = new GzipSource(body.source());
            // the new body is no longer encoded and its decompressed length is unknown
            Headers strippedHeaders = response.headers().newBuilder()
                    .removeAll("Content-Encoding")
                    .removeAll("Content-Length")
                    .build();
            // public ResponseBody.create is used here instead of the internal RealResponseBody
            return response.newBuilder().headers(strippedHeaders)
                    .body(ResponseBody.create(body.contentType(), -1L, Okio.buffer(gzipSource)))
                    .build();
        }
        else
        {
            // uncompressed responses pass through untouched
            return response;
        }
    }
}
Upvotes: 25