Reputation: 186
I want to download large files from S3 to EC2 instances for file manipulation. What is the fastest and most efficient way to do this?
Thanks in advance!
Upvotes: 3
Views: 11297
Reputation: 61
A tool like s5cmd can be significantly faster than aws-cli for downloading objects (it is written in Go rather than Python, which makes a big difference). Its GitHub README includes benchmark results showing a speed difference of roughly 10x.
It can be used like this:
s5cmd cp s3://bucketname/object.txt localobject.txt
Upvotes: 1
Reputation: 101
One way to gain speed is to divide the problem into smaller pieces, process those pieces in parallel, and then assemble the results. In this case, I believe you could write a utility with a number of workers, each responsible for downloading one portion of the file from S3 to EBS/EFS using ranged GETs (HTTP Range requests). The workers all run in parallel; once every piece has been downloaded, the pieces are combined into a single file.
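To make the idea concrete, here is a minimal Python sketch of the worker approach, not a tuned implementation. The names `split_ranges`, `download_parallel`, and `s3_fetcher` are hypothetical helpers invented for illustration, and the 8 MiB part size and 8 workers are arbitrary assumptions. One deliberate variation from the description above: rather than combining the pieces afterwards, each worker writes its piece directly at the right offset of a pre-sized file. The S3 part assumes boto3 is installed and credentials are configured.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(size, part_size):
    """Return inclusive (start, end) byte ranges that together cover `size` bytes."""
    return [(start, min(start + part_size, size) - 1)
            for start in range(0, size, part_size)]


def download_parallel(fetch_range, size, dest, part_size=8 * 1024 * 1024, workers=8):
    """Fetch every range concurrently and write each piece at its own offset.

    `fetch_range(start, end)` must return the bytes start..end (inclusive).
    """
    # Pre-size the destination so each worker can write its piece in place,
    # which avoids a separate "combine the parts" pass at the end.
    with open(dest, "wb") as f:
        f.truncate(size)

    def worker(byte_range):
        start, end = byte_range
        data = fetch_range(start, end)
        with open(dest, "r+b") as f:  # each worker opens its own file handle
            f.seek(start)
            f.write(data)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for all workers and re-raises any exception they hit.
        list(pool.map(worker, split_ranges(size, part_size)))


def s3_fetcher(bucket, key):
    """Build (size, fetch_range) for one S3 object. Assumes boto3 and credentials."""
    import boto3  # imported lazily so the helpers above run without it

    s3 = boto3.client("s3")
    size = s3.head_object(Bucket=bucket, Key=key)["ContentLength"]

    def fetch_range(start, end):
        # A ranged GET: only the requested bytes are transferred.
        resp = s3.get_object(Bucket=bucket, Key=key, Range=f"bytes={start}-{end}")
        return resp["Body"].read()

    return size, fetch_range
```

With placeholder bucket and object names, usage would look like `size, fetch = s3_fetcher("bucketname", "object.txt")` followed by `download_parallel(fetch, size, "localobject.txt")`. Whether this beats an existing tool depends on object size, instance network bandwidth, and disk throughput, so it is worth measuring before adopting it.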
Upvotes: 1
Reputation: 269340
Use the AWS Command-Line Interface (CLI).
It has an aws s3 cp command to download files, and an aws s3 sync command to synchronize content between a local directory and S3 (or vice versa).
Upvotes: 7