lolski

Reputation: 17501

AWS S3: Uploading large file fails with ResetException: Failed to reset the request input stream

Can anyone tell me what is wrong with the following code such that a large file upload (>10GB) always fails with ResetException: Failed to reset the request input stream?

The failure always happens after a while (around 15 minutes), which means the upload runs for some time before failing somewhere in the middle.

Here's what I've tried to debug the problem:

  1. in.markSupported() == false // checking if mark is supported on my FileInputStream

    I highly suspect that this is the problem, since the S3 SDK apparently needs to reset the stream at some point during the upload, probably when the connection is lost or the transfer encounters an error.

  2. Wrapping my FileInputStream in a BufferedInputStream to enable marking. Now calling in.markSupported() returns true, meaning that mark support is there. Strangely, the upload still fails with the same kind of error.

  3. Adding putRequest.getRequestClientOptions.setReadLimit(n), with n = 100000 (100 KB) and n = 800000000 (800 MB), but it still throws the same error. I suspect this is because the parameter is used when resetting the stream, which, as stated above, a plain FileInputStream doesn't support.

Interestingly, the same problem doesn't happen on my AWS development account. I assume that is just because the dev account is not under as heavy a load as my production account, so the upload can run smoothly without any failure at all.
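The "Resetting to invalid mark" in the stack trace below can be reproduced with plain java.io streams. This standalone demo (the object and method names are mine, not the SDK's) shows that markSupported returning true is not enough: once more bytes than the mark limit have been consumed, reset() fails, which is exactly what happens when the SDK retries a part it has already streamed out.

```scala
import java.io.{BufferedInputStream, ByteArrayInputStream, IOException}

// Hypothetical demo, not an SDK API: a BufferedInputStream supports
// mark/reset, but only within its mark limit. Reading past that limit
// invalidates the mark, and reset() then throws
// java.io.IOException: Resetting to invalid mark.
object MarkResetDemo {
  def resetSucceedsAfterReading(bytesRead: Int): Boolean = {
    val src = new ByteArrayInputStream(new Array[Byte](1024))
    val in  = new BufferedInputStream(src, 16) // 16-byte internal buffer
    in.mark(16)                                // mark valid for at most 16 bytes
    in.read(new Array[Byte](bytesRead))        // consume bytesRead bytes
    try { in.reset(); true }                   // rewind to the mark, if still valid
    catch { case _: IOException => false }     // "Resetting to invalid mark"
  }
}
```

Reading 8 bytes stays within the limit and the reset succeeds; reading 64 bytes invalidates the mark and the reset fails, mirroring the SDK's retry failure.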

Please have a look at my code below:

object S3TransferExample {
  // in main class
  def main(args: Array[String]): Unit = {
    ...
    val file = new File("/mnt/10gbfile.zip")
    val in = new FileInputStream(file)
    // val in = new BufferedInputStream(new FileInputStream(file)) // tried wrapping the FileInputStream in a BufferedInputStream, but it didn't help
    upload("mybucket", "mykey", in, file.length, "application/zip").waitForUploadResult
    ...
  }

  val awsCred = new BasicAWSCredentials("access_key", "secret_key")
  val s3Client = new AmazonS3Client(awsCred)
  val tx = new TransferManager(s3Client)

  def upload(bucketName: String,
             keyName: String,
             inputStream: InputStream,
             contentLength: Long,
             contentType: String,
             serverSideEncryption: Boolean = true,
             storageClass: StorageClass = StorageClass.ReducedRedundancy): Upload = {
    val metaData = new ObjectMetadata
    metaData.setContentType(contentType)
    metaData.setContentLength(contentLength)

    if (serverSideEncryption) {
      metaData.setSSEAlgorithm(ObjectMetadata.AES_256_SERVER_SIDE_ENCRYPTION)
    }

    val putRequest = new PutObjectRequest(bucketName, keyName, inputStream, metaData)
    putRequest.setStorageClass(storageClass)
    putRequest.getRequestClientOptions.setReadLimit(100000)

    tx.upload(putRequest)
  }
}

Here is the complete stack trace:

Unable to execute HTTP request: mybucket.s3.amazonaws.com failed to respond
org.apache.http.NoHttpResponseException: mybucket.s3.amazonaws.com failed to respond
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:143) ~[httpclient-4.3.4.jar:4.3.4]
    at org.apache.http.impl.conn.DefaultHttpResponseParser.parseHead(DefaultHttpResponseParser.java:57) ~[httpclient-4.3.4.jar:4.3.4]
    at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:260) ~[httpcore-4.3.2.jar:4.3.2]
    at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:283) ~[httpcore-4.3.2.jar:4.3.2]
    at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:251) ~[httpclient-4.3.4.jar:4.3.4]
    at org.apache.http.impl.conn.ManagedClientConnectionImpl.receiveResponseHeader(ManagedClientConnectionImpl.java:197) ~[httpclient-4.3.4.jar:4.3.4]
    at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:271) ~[httpcore-4.3.2.jar:4.3.2]
    at com.amazonaws.http.protocol.SdkHttpRequestExecutor.doReceiveResponse(SdkHttpRequestExecutor.java:66) ~[aws-java-sdk-core-1.9.13.jar:na]
    at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:123) ~[httpcore-4.3.2.jar:4.3.2]
    at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:685) ~[httpclient-4.3.4.jar:4.3.4]
    at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:487) ~[httpclient-4.3.4.jar:4.3.4]
    at org.apache.http.impl.client.AbstractHttpClient.doExecute(AbstractHttpClient.java:863) ~[httpclient-4.3.4.jar:4.3.4]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:82) ~[httpclient-4.3.4.jar:4.3.4]
    at org.apache.http.impl.client.CloseableHttpClient.execute(CloseableHttpClient.java:57) ~[httpclient-4.3.4.jar:4.3.4]
    at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:685) [aws-java-sdk-core-1.9.13.jar:na]
    at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:460) [aws-java-sdk-core-1.9.13.jar:na]
    at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:295) [aws-java-sdk-core-1.9.13.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3710) [aws-java-sdk-s3-1.9.13.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:2799) [aws-java-sdk-s3-1.9.13.jar:na]
    at com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:2784) [aws-java-sdk-s3-1.9.13.jar:na]
    at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadPartsInSeries(UploadCallable.java:259) [aws-java-sdk-s3-1.9.13.jar:na]
    at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInParts(UploadCallable.java:193) [aws-java-sdk-s3-1.9.13.jar:na]
    at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:125) [aws-java-sdk-s3-1.9.13.jar:na]
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:129) [aws-java-sdk-s3-1.9.13.jar:na]
    at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:50) [aws-java-sdk-s3-1.9.13.jar:na]
    at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_40]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_40]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_40]
    at java.lang.Thread.run(Thread.java:745) [na:1.8.0_40]
com.amazonaws.ResetException: Failed to reset the request input stream;  If the request involves an input stream, the maximum stream buffer size can be configured via request.getRequestClientOptions().setReadLimit(int)
  at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:636)
  at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:460)
  at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:295)
  at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3710)
  at com.amazonaws.services.s3.AmazonS3Client.doUploadPart(AmazonS3Client.java:2799)
  at com.amazonaws.services.s3.AmazonS3Client.uploadPart(AmazonS3Client.java:2784)
  at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadPartsInSeries(UploadCallable.java:259)
  at com.amazonaws.services.s3.transfer.internal.UploadCallable.uploadInParts(UploadCallable.java:193)
  at com.amazonaws.services.s3.transfer.internal.UploadCallable.call(UploadCallable.java:125)
  at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:129)
  at com.amazonaws.services.s3.transfer.internal.UploadMonitor.call(UploadMonitor.java:50)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.io.IOException: Resetting to invalid mark
  at java.io.BufferedInputStream.reset(BufferedInputStream.java:448)
  at com.amazonaws.internal.SdkBufferedInputStream.reset(SdkBufferedInputStream.java:106)
  at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:103)
  at com.amazonaws.event.ProgressInputStream.reset(ProgressInputStream.java:139)
  at com.amazonaws.internal.SdkFilterInputStream.reset(SdkFilterInputStream.java:103)
  at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:634) 

Upvotes: 8

Views: 22633

Answers (3)

osexp2000

Reputation: 3165

I've investigated this issue; it's a long story.

The conclusion is: pass a system property to the JVM by adding the following option to the java command line

-Dcom.amazonaws.sdk.s3.defaultStreamBufferSize=YOUR_MAX_PUT_SIZE

See https://github.com/aws/aws-sdk-java/blob/master/aws-java-sdk-s3/src/main/java/com/amazonaws/services/s3/AmazonS3Client.java#L1668

This tells AmazonS3Client to set an appropriate maximum size for the rewindable buffer, which is used to re-read the stream on retry.
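If editing the command line is awkward, the same property can be set programmatically, as long as it happens before the client starts buffering the stream. A minimal sketch (the helper object is mine; only the property name comes from the SDK source linked above, and the value is illustrative):

```scala
// Hypothetical helper: sets the SDK's default stream buffer size
// programmatically instead of via -D on the java command line.
// Must run before the first upload begins.
object StreamBufferConfig {
  val PropertyName = "com.amazonaws.sdk.s3.defaultStreamBufferSize"

  def apply(maxPutBytes: Long): Unit =
    System.setProperty(PropertyName, maxPutBytes.toString)
}
```

For example, `StreamBufferConfig(file.length)` before calling upload would size the buffer to the whole file, at the cost of that much memory.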

Upvotes: 5

lolski

Reputation: 17501

This definitely looks like a bug, which I have reported. The solution is to use the other PutObjectRequest constructor, which accepts a File instead of an InputStream:

def upload(bucketName: String,
           keyName: String,
           file: File,
           contentLength: Long,
           contentType: String,
           serverSideEncryption: Boolean = true,
           storageClass: StorageClass = StorageClass.ReducedRedundancy): Upload = {
  val metaData = new ObjectMetadata
  metaData.setContentType(contentType)
  metaData.setContentLength(contentLength)

  if (serverSideEncryption) {
    metaData.setSSEAlgorithm(ObjectMetadata.AES_256_SERVER_SIDE_ENCRYPTION)
  }

  val putRequest = new PutObjectRequest(bucketName, keyName, file)
  putRequest.setStorageClass(storageClass)
  putRequest.getRequestClientOptions.setReadLimit(100000)
  putRequest.setMetadata(metaData)
  tx.upload(putRequest)
}

Upvotes: 7

Michael - sqlbot

Reputation: 179124

S3 doesn't support a PUT request that large.

The largest object that can be uploaded in a single PUT is 5 gigabytes.

http://aws.amazon.com/s3/faqs

Beyond that, you have to use the multipart upload API, which allows each part to be up to 5GB and the maximum object size to be 5TB. You'd be well served to use multipart for files smaller than 5GB too, since it supports uploading the parts in parallel.
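To get a feel for those limits, here is a hypothetical helper (my names, not an SDK API) that picks the smallest part size keeping an object within S3's multipart constraints: at most 10,000 parts, each between 5 MB and 5 GB.

```scala
// Hypothetical helper, not an SDK API: choose a multipart part size.
object MultipartMath {
  val MinPart: Long  = 5L * 1024 * 1024        // 5 MB minimum part size
  val MaxPart: Long  = 5L * 1024 * 1024 * 1024 // 5 GB maximum part size
  val MaxParts: Long = 10000L                  // at most 10,000 parts

  // Smallest part size that keeps the part count <= 10,000,
  // rounded up to the 5 MB floor.
  def partSizeFor(objectSize: Long): Long = {
    require(objectSize <= MaxPart * MaxParts, "object exceeds the 5 TB cap")
    val bySize = (objectSize + MaxParts - 1) / MaxParts // ceil(size / 10000)
    math.max(MinPart, bySize)
  }
}
```

For the 10 GB file in the question, the 5 MB floor dominates and the upload would be split into roughly 2,048 parts; only near the 5 TB cap does the part-count limit force larger parts.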

Upvotes: 1
