Ketan Chaudhari

Reputation: 433

Download file from URL and upload it to AWS S3 without saving into memory using AWS SDK for Java, version 2

I am writing code that downloads a file from a URL and uploads it to S3, without storing it temporarily in a file or in memory. I am downloading through an 'InputStream', but AWS S3 requires the file size, which I don't have from the 'InputStream'. Is there any other way? I found this discussion on the same topic using 'Node.js'.


My code to fetch the file into an InputStream:


HttpClient client = HttpClient.newBuilder().build();
URI uri = URI.create("{myUrl}");
HttpRequest request = HttpRequest.newBuilder().uri(uri).build();
InputStream is = client.send(request, HttpResponse.BodyHandlers.ofInputStream()).body();

Code I tried to use to put the object into S3, but I don't have content_length:


S3Client s3Client = S3Client.builder().build();
PutObjectRequest objectRequest = PutObjectRequest.builder()
                            .bucket(BUCKET_NAME)
                            .key(KEY)
                            .build();

PutObjectResponse por = s3Client.putObject(objectRequest, RequestBody.fromInputStream(is,content_length));

Upvotes: 5

Views: 6043

Answers (2)

Parsifal

Reputation: 4516

You have a few options.

The easiest is to retain the HttpResponse from your client.send() and read the Content-Length header from it. You should also look for headers like Content-Type, and store them as metadata on the S3 object.
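A sketch of that approach, reusing the question's own variables (BUCKET_NAME, KEY, and the "{myUrl}" placeholder); the orElseThrow/orElse fallbacks are one way to handle a server that omits a header:

```java
HttpClient client = HttpClient.newBuilder().build();
HttpRequest request = HttpRequest.newBuilder().uri(URI.create("{myUrl}")).build();

// Keep the whole response, not just the body, so the headers are available
HttpResponse<InputStream> response =
        client.send(request, HttpResponse.BodyHandlers.ofInputStream());

long contentLength = response.headers()
        .firstValueAsLong("Content-Length")
        .orElseThrow(() -> new IllegalStateException("server sent no Content-Length"));
String contentType = response.headers()
        .firstValue("Content-Type")
        .orElse("application/octet-stream");

S3Client s3Client = S3Client.builder().build();
PutObjectRequest objectRequest = PutObjectRequest.builder()
        .bucket(BUCKET_NAME)
        .key(KEY)
        .contentType(contentType)   // carry the origin server's metadata over
        .build();

s3Client.putObject(objectRequest,
        RequestBody.fromInputStream(response.body(), contentLength));
```

This streams the body straight through: nothing is buffered beyond the SDK's internal chunks, since the length is known up front.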

That isn't guaranteed to work in all cases: some servers do not provide Content-Length. In that case you need to create a multipart upload to send the file. When doing this, you buffer relatively small chunks (minimum 5 MB each) in memory, and can upload up to 10,000 chunks. You must either complete or abort the upload, or configure your bucket to delete uncompleted uploads after a certain period of time; otherwise you'll be charged for the storage used by the incomplete upload's parts.
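A sketch of that multipart flow with the V2 SDK, assuming an existing S3Client (s3Client), the stream (is) from the question, and the same BUCKET_NAME/KEY constants; the try/catch aborts the upload on failure so orphaned parts aren't left behind:

```java
// Stream an InputStream of unknown length to S3, one 5 MB part at a time.
String uploadId = s3Client.createMultipartUpload(
        CreateMultipartUploadRequest.builder().bucket(BUCKET_NAME).key(KEY).build()
).uploadId();

try {
    List<CompletedPart> completedParts = new ArrayList<>();
    byte[] buffer = new byte[5 * 1024 * 1024];   // 5 MB minimum part size
    int partNumber = 1;
    int read;
    // readNBytes (Java 9+) blocks until the buffer is full or the stream ends,
    // so only the final part can be short -- exactly what S3 requires
    while ((read = is.readNBytes(buffer, 0, buffer.length)) > 0) {
        UploadPartResponse part = s3Client.uploadPart(
                UploadPartRequest.builder()
                        .bucket(BUCKET_NAME).key(KEY)
                        .uploadId(uploadId).partNumber(partNumber)
                        .build(),
                RequestBody.fromBytes(Arrays.copyOf(buffer, read)));
        completedParts.add(CompletedPart.builder()
                .partNumber(partNumber).eTag(part.eTag()).build());
        partNumber++;
    }
    s3Client.completeMultipartUpload(CompleteMultipartUploadRequest.builder()
            .bucket(BUCKET_NAME).key(KEY).uploadId(uploadId)
            .multipartUpload(CompletedMultipartUpload.builder()
                    .parts(completedParts).build())
            .build());
} catch (Exception e) {
    // Abort so the partial parts aren't retained (and billed)
    s3Client.abortMultipartUpload(AbortMultipartUploadRequest.builder()
            .bucket(BUCKET_NAME).key(KEY).uploadId(uploadId).build());
    throw e;
}
```

Peak memory use is roughly one part's worth of data (plus the defensive copy), regardless of the total object size.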

A third alternative is to use the V1 SDK, which provides TransferManager. It handles the multipart upload for you and uses multiple threads to improve throughput, but it has not yet been implemented for V2.
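For reference, a minimal V1 sketch (bucketName, key, and inputStream are placeholders); note that if you hand V1's TransferManager an InputStream without setting a content length on the metadata, it will buffer the stream in memory itself:

```java
TransferManager tm = TransferManagerBuilder.standard().build();

ObjectMetadata metadata = new ObjectMetadata();
// If the length is known after all, set it so the upload can stream:
// metadata.setContentLength(contentLength);

Upload upload = tm.upload(bucketName, key, inputStream, metadata);
upload.waitForCompletion();   // blocks until done; throws on failure

tm.shutdownNow(false);        // false = leave the underlying client open if shared
```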

Upvotes: 1

smac2020

Reputation: 10734

"Code I tried to insert into S3, but I don't have content_length"

To get around needing the content length: instead of using an InputStream, which does require the content length, you can use a byte[], as described here.

https://sdk.amazonaws.com/java/api/latest/software/amazon/awssdk/core/sync/RequestBody.html#fromBytes-byte:A-

How you get the byte array depends on the application you are building. In some apps, the byte array comes from a file posted to a web app; in others, from a file read at a known location. Either way, your app has to obtain a byte array somehow and use that data to upload content to an S3 bucket.

If your app has an InputStream (which it seems you do, based on your description), convert it to a byte[] using standard Java. Once you have the byte[], you can call putObject, as shown here.

public String putObject(byte[] data, String bucketName, String objectKey) {

    S3Client s3 = getClient();   // helper that returns a configured S3Client

    try {
        // Put the object into the bucket
        PutObjectResponse response = s3.putObject(PutObjectRequest.builder()
                        .bucket(bucketName)
                        .key(objectKey)
                        .build(),
                RequestBody.fromBytes(data));

        return response.eTag();

    } catch (S3Exception e) {
        System.err.println(e.getMessage());
        System.exit(1);
    }
    return "";
}
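If the file fits comfortably in memory, Java 9's InputStream.readAllBytes() is the simplest way to produce that byte[]; be aware this buffers the entire file, which is the trade-off the question was trying to avoid (StreamToBytes and toBytes are just illustrative names):

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

public class StreamToBytes {

    // Reads the whole stream into memory; fine for small objects only.
    static byte[] toBytes(InputStream is) throws IOException {
        return is.readAllBytes();   // Java 9+
    }

    public static void main(String[] args) throws IOException {
        InputStream is = new ByteArrayInputStream("hello s3".getBytes());
        byte[] data = toBytes(is);
        System.out.println(data.length);   // prints 8
    }
}
```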

Upvotes: 1
