Reputation: 9141
From the docs: "When you download a file using TransferManager, the utility automatically determines if the object is multipart"
Source: https://aws.amazon.com/fr/blogs/developer/parallelizing-large-downloads-for-optimal-speed/
This suggests there is an indicator somewhere (metadata? properties?) that tells you whether an object is "multipart". I'm testing the AWS REST APIs with the AWS CLI before moving on to the Java SDK, and I'm focusing on multipart uploads/downloads (according to the docs, a download will be multipart only if the upload was multipart).
First I set the threshold explicitly to 5MB:
$ aws configure set default.s3.multipart_threshold 5MB
Then I upload a 20 MB file:
$ aws s3 cp ./my-file s3://my-bucket/test/multipart-upload-1
It takes 45s, and when I check during the upload with:
$ aws s3api list-multipart-uploads --bucket my-bucket
I can see that my upload is in the list, but I see only one upload and no information about the number of "parts" or connections.
If I set the threshold to 50MB (well above the file size), the upload is much faster (done in 10s), and during the upload it does not show up in the output of:
$ aws s3api list-multipart-uploads --bucket my-bucket
So this suggests that the first upload was handled as a "multipart" upload, but I have no information about the number of parts, and once the upload is finished I can't distinguish between files uploaded as multipart and files uploaded in a single request.
Upvotes: 4
Views: 3048
Reputation: 22332
To know whether an object is multipart or not, you can check the ETag.
For a non-multipart object, the ETag looks something like 0a3dbf3a768081d785c20b498b4abd24
For a multipart one, the ETag looks like ceb8853ddc5086cc4ab9e149f8f09c88-2
You can tell them apart by the - character.
With the AWS CLI, you can retrieve the ETag of an object with this command:
aws s3api head-object --bucket <bucket> --key <object_key> | grep ETag
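With boto3, you can retrieve the ETag like this:
from boto3 import client
s3 = client('s3')
# Replace the placeholder bucket name and key with your own values.
response = s3.head_object(Bucket='<bucket>', Key='<object_key>')
print(response['ETag'])  # the returned value includes surrounding double quotes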
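The check on the - character described above can be sketched as a small helper. This is only a minimal sketch: the name `is_multipart_etag` is made up for illustration, and it assumes the ETag has the usual shape of 32 hexadecimal characters, optionally followed by - and a part count, which may not hold for every object (for example some server-side encrypted ones).

```python
import re

# Assumed shape of a multipart ETag: 32 hex characters, a dash, then a part count.
_MULTIPART_ETAG = re.compile(r"^[0-9a-f]{32}-\d+$")


def is_multipart_etag(etag: str) -> bool:
    """Return True if the ETag looks like one produced by a multipart upload.

    The ETag returned by head-object is wrapped in double quotes,
    so they are stripped before matching.
    """
    return _MULTIPART_ETAG.match(etag.strip('"').lower()) is not None
```

It can be fed directly with the `ETag` value returned by `head_object`.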
Upvotes: 1
Reputation: 107
You can most easily tell whether an object is multipart by looking at its ETag. If the ETag is longer than 32 characters and ends with -#, you know the object was created by a multipart upload. The # at the end of the ETag denotes the number of parts in the object.
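As a rough illustration, the part count can be read from that suffix like this. The helper name `etag_part_count` is hypothetical, and the sketch assumes the "-number of parts" convention described here actually applies to the object in question.

```python
from typing import Optional


def etag_part_count(etag: str) -> Optional[int]:
    """Return the number of parts encoded in a multipart ETag, or None.

    None means the ETag has no "-<number>" suffix, i.e. the object
    was most likely uploaded in a single request.
    """
    value = etag.strip('"')  # S3 returns the ETag wrapped in double quotes
    _digest, sep, suffix = value.rpartition("-")
    if not sep or not suffix.isdecimal():
        return None
    return int(suffix)
```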
I'm not sure whether this is documented anywhere specifically; however, the format has been successfully worked out in other Stack Overflow questions using this approach:
What is the algorithm to compute the Amazon-S3 Etag for a file larger than 5GB?
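The algorithm usually reported in that question is: take the MD5 digest of each part, concatenate the binary digests, take the MD5 of the result, and append - plus the number of parts. Below is a minimal sketch of it, not an official specification. The function name `expected_multipart_etag` is made up for illustration, the part size must match the one used for the upload (the AWS CLI default `multipart_chunksize` is 8MB), and the result may not match for objects encrypted with SSE-KMS or SSE-C.

```python
import hashlib


def expected_multipart_etag(data: bytes, part_size: int) -> str:
    """Compute the ETag S3 is commonly observed to assign to a multipart upload.

    Assumes the object was uploaded in consecutive parts of `part_size`
    bytes (the last part may be shorter).
    """
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    if not data:
        raise ValueError("data must not be empty")

    # MD5 digest (raw bytes, not hex) of each part, in upload order.
    part_digests = [
        hashlib.md5(data[offset:offset + part_size]).digest()
        for offset in range(0, len(data), part_size)
    ]

    # MD5 of the concatenated part digests, followed by the part count.
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"
```

For a real file you would read it in chunks rather than load it fully into memory; the in-memory version is kept here only for brevity.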
Upvotes: 1