Reputation: 21
I am using Apache HttpComponents in a bean inside Camel to write a job that downloads Apple's metadata database files. One of them is a list of every song in iTunes, so it is big: over 3.5 GB. I am making an asynchronous GET request with Apache HttpComponents, but the file being returned seems to be too large for it to handle.
try {
httpclient.start();
FileOutputStream fileOutputStream = new FileOutputStream(download);
//Grab the archive.
URIBuilder uriBuilder = new URIBuilder();
uriBuilder.setScheme("https");
uriBuilder.setHost("feeds.itunes.apple.com");
uriBuilder.setPath("/feeds/epf-flat/v1/full/usa/" + iTunesDate + "/song-usa-" + iTunesDate + ".tbz");
String endpoint = uriBuilder.build().toURL().toString();
HttpGet getCall = new HttpGet(endpoint);
String creds64 = new String(Base64.encodeBase64((user + ":" + password).getBytes()));
log.debug("Auth: " + "Basic " + creds64);
getCall.setHeader("Authorization", "Basic " + creds64);
log.debug("About to download file from Apple: " + endpoint);
Future<HttpResponse> future = httpclient.execute(getCall, null);
HttpResponse response = future.get();
fileOutputStream.write(EntityUtils.toByteArray(response.getEntity()));
fileOutputStream.close();
Every time, it fails with this:
java.util.concurrent.ExecutionException: org.apache.http.ContentTooLongException: Entity content is too long: 3776283429
at org.apache.http.concurrent.BasicFuture.getResult(BasicFuture.java:68)
at org.apache.http.concurrent.BasicFuture.get(BasicFuture.java:77)
at com.decibly.hive.songs.iTunesWrapper.getSongData(iTunesWrapper.java:89)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.camel.component.bean.MethodInfo.invoke(MethodInfo.java:407)
So the size of the file in bytes is too big for a Java integer, which HttpComponents appears to be using to track the response size. I get that; I'm wondering if there are any workarounds aside from dropping back a layer and calling the java.net libraries directly.
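To make the size problem concrete, here is a small sketch using only the JDK. It checks the length reported in the exception message against the largest index a Java array can have. The class and method names are made up for illustration; they are not part of HttpComponents. As far as I can tell, the exception is raised because the default response consumer tries to buffer the whole body in memory, and that cannot work for a body this large.

```java
// Illustrative only: why a 3,776,283,429-byte body cannot be buffered in one byte[].
// ContentLengthCheck and fitsInByteArray are hypothetical names, not library API.
final class ContentLengthCheck {

    /** Returns true if a body of the given length could fit in a single Java array. */
    static boolean fitsInByteArray(long contentLength) {
        return contentLength >= 0 && contentLength <= Integer.MAX_VALUE;
    }

    public static void main(String[] args) {
        long reported = 3776283429L; // length taken from the exception message

        // Integer.MAX_VALUE is 2147483647, so this prints false.
        System.out.println(fitsInByteArray(reported));

        // Narrowing the long to an int wraps around; this prints -518683867.
        System.out.println((int) reported);
    }
}
```

Any approach that reads the entire entity into a byte array, such as `EntityUtils.toByteArray`, runs into this ceiling regardless of how much heap is available.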
Upvotes: 0
Views: 2203
Reputation: 181
Use HttpAsyncClient, which is built on top of HttpComponents and supports zero-copy transfer.
See an example here: https://hc.apache.org/httpcomponents-asyncclient-4.1.x/httpasyncclient/examples/org/apache/http/examples/nio/client/ZeroCopyHttpExchange.java
Or simply, in your case:
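CloseableHttpAsyncClient httpclient = HttpAsyncClientBuilder.create()....
// The consumer writes the response body straight into the target file.
ZeroCopyConsumer<File> consumer = new ZeroCopyConsumer<File>(new File(download)) {
    @Override
    protected File process(
            final HttpResponse response,
            final File file,
            final ContentType contentType) throws Exception {
        // Treat anything other than 200 OK as a failed download.
        if (response.getStatusLine().getStatusCode() != HttpStatus.SC_OK) {
            throw new ClientProtocolException("Connection to host failed: " + response.getStatusLine());
        }
        return file;
    }
};
// Note: createGet(endpoint) builds a fresh request without the Authorization header
// from the question; HttpAsyncMethods.create(getCall) can be used to reuse that request.
httpclient.execute(HttpAsyncMethods.createGet(endpoint), consumer, null, null).get();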
The body of the response is saved directly to a file, so the only size limit is the one imposed by the file system.
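The key idea is that the body is never held in memory as a single array. As a rough JDK-only sketch of that idea (not what `ZeroCopyConsumer` does internally), the helper below copies any `InputStream` to a file in small chunks and counts the bytes in a `long`. The names `StreamToFile` and `copyToFile` are hypothetical. If you do fall back to java.net, the stream could come from `HttpURLConnection.getInputStream()`; here it is fed from an in-memory array so the example runs without a network.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: streams a body to disk without buffering it all in memory.
final class StreamToFile {

    /** Copies the stream to the target file in 8 KB chunks and returns the byte count. */
    static long copyToFile(InputStream body, Path target) throws IOException {
        byte[] buffer = new byte[8192];
        long total = 0; // a long, so totals above Integer.MAX_VALUE are fine
        try (OutputStream out = Files.newOutputStream(target)) {
            int read;
            while ((read = body.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for a response body; a real download would pass the HTTP stream.
        byte[] fake = new byte[20000];
        Path target = Files.createTempFile("download", ".bin");
        long written = copyToFile(new ByteArrayInputStream(fake), target);
        System.out.println(written + " bytes written to " + target);
        Files.delete(target);
    }
}
```

Memory use stays at the size of the buffer no matter how large the download is, which is the same property the zero-copy consumer gives you.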
Upvotes: 2