user6391187

Reputation: 87

Copy list of files from S3 bucket to S3 bucket

Is there a way to copy a list of files from one S3 bucket to another? Both S3 buckets are in the same AWS account. I am able to copy a single file at a time using the AWS CLI command:

     aws s3 cp s3://source-bucket/file.txt s3://target-bucket/file.txt

However, I have 1000+ files to copy. I do not want to copy all files in the source bucket, so I am not able to use the sync command. Is there a way to supply a file containing the list of file names that need to be copied, to automate this process?

Upvotes: 0

Views: 6487

Answers (3)

Avi

Reputation: 454

If you want to use the AWS CLI, you could use cp in a loop over a file containing the names of the files you want to copy:

while IFS= read -r FNAME
do
  aws s3 cp "s3://source-bucket/$FNAME" "s3://target-bucket/$FNAME"
done < file_list.csv

I've done this for a few hundred files. It's not efficient because you have to make a request for each file.

A better way would be to use the --include argument multiple times in one cp line (along with --recursive, which a filtered S3-to-S3 copy requires). If you could generate all those arguments in the shell from a list of files you would effectively have

aws s3 cp s3://source-bucket/ s3://target-bucket/ --recursive --exclude "*" --include "somefile.txt" --include "someotherfile.jpg" --include "another.json" ...

I'll let someone more skilled figure out how to script that.
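One way to script it, sketched in bash: build the --exclude/--include arguments in an array, one --include per key. This assumes the list of keys arrives one per line (inlined here with printf for illustration; in practice you would redirect from file_list.csv instead), and it echoes the command rather than executing it, so you can inspect it first:

```shell
# Build the filter arguments from a newline-separated list of keys.
args=(--exclude "*")
while IFS= read -r fname; do
  args+=(--include "$fname")
done < <(printf '%s\n' somefile.txt someotherfile.jpg another.json)

# Echo the generated command so it can be reviewed before running for real.
echo aws s3 cp s3://source-bucket/ s3://target-bucket/ --recursive "${args[@]}"
```

Drop the echo (and replace the printf with `< file_list.csv`) to actually run the copy. Note that --include takes glob patterns, so one flag per exact file name works fine.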

Upvotes: 0

cookiedough

Reputation: 3840

Approaching this problem from the Python side, you can run a script that does it for you. Since you have a lot of files, it might take a while but should get the job done. Save the following code in a file with a .py extension and run it. You might need to run pip install boto3 first in your terminal in case you don't already have it.

import boto3

s3 = boto3.resource('s3')
list_of_files = ['file1.txt', 'file2.txt']
for key in list_of_files:
    # Fetch each listed object from the old bucket and write it to the new one,
    # rather than scanning every object in the bucket to find matches.
    body = s3.Object('oldBucket', key).get()["Body"].read()
    s3.Object('newBucket', key).put(Body=body)

Upvotes: 1

Arafat Nalkhande

Reputation: 11708

You can use the --exclude and --include filters, together with the --recursive flag, in the s3 cp command to copy multiple files.

Following is an example:

aws s3 cp /tmp/foo/ s3://bucket/ --recursive --exclude "*" --include "*.jpg"

For more details, see the AWS CLI documentation for aws s3 cp.

Upvotes: 3
