Reputation: 39
I want to transfer files from one S3 bucket path (say B1/x/*) to another S3 bucket path (say B2/y/*), where B1 and B2 are two S3 buckets and x and y are folders in them that contain CSV files.
I have written the script below to do this, but I am getting the error `'object_list' is not defined`. Moreover, I am not sure whether it will actually transfer the files.
Here is the script:
import boto3
s3 = boto3.client("s3")
# list_objects_v2() give more info
more_objects=True
found_token = True
while more_objects :
    if found_token :
        response= s3.list_objects_v2(
            Bucket="B1",
            Prefix="x/",
            Delimiter="/")
    else:
        response= s3.list_objects_v2(
            Bucket="B1",
            ContinuationToken=found_token,
            Prefix="x/",
            Delimiter="/")
    # use copy_object or copy_from
    for source in object_list["Contents"]:
        raw_name = source["Key"].split("/")[-1]
        new_name = "new_structure/{}".format(raw_name)
        s3.copy_from(CopySource='B1/x')
    # Now check there is more objects to list
    if "NextContinuationToken" in response:
        found_token = response["NextContinuationToken"]
        more_objects = True
    else:
        more_objects = False
It would be really helpful if anyone could help me fix the script above.
Thanks
Upvotes: 1
Views: 6801
Reputation: 1136
If your script runs on a local server and needs access to both buckets to transfer files from one S3 bucket to another, you can use the code below. It copies every file in "bucket1" into the "sample" folder in "bucket2".
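import boto3
s3 = boto3.resource('s3')
src_bucket = s3.Bucket('bucket1')
dest_bucket = s3.Bucket('bucket2')
for obj in src_bucket.objects.all():
    # Keep only the file name, dropping any folder part of the source key.
    filename = obj.key.split('/')[-1]
    # Download the object body and upload it under the "sample/" folder.
    dest_bucket.put_object(Key='sample/' + filename, Body=obj.get()["Body"].read())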
If you want to remove the files from the source bucket after copying, you can add the line below inside the loop, after the copy.
s3.Object(src_bucket.name, obj.key).delete()
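For the B1/x/* to B2/y/* case in the question, a server-side copy avoids downloading each file and uploading it again. Below is a minimal sketch of that idea, not a definitive implementation: `destination_key` and `copy_prefix` are hypothetical helper names, and `s3_client` is assumed to be a `boto3.client("s3")` whose credentials can read the source bucket and write to the destination bucket. It relies on the `list_objects_v2` paginator and `copy_object`.

```python
def destination_key(source_key, src_prefix, dest_prefix):
    """Map e.g. 'x/file.csv' to 'y/file.csv' by swapping the leading prefix."""
    if not source_key.startswith(src_prefix):
        raise ValueError(f"{source_key!r} does not start with {src_prefix!r}")
    return dest_prefix + source_key[len(src_prefix):]


def copy_prefix(s3_client, src_bucket, src_prefix, dest_bucket, dest_prefix):
    """Copy every object under src_prefix to dest_prefix; return the new keys."""
    copied = []
    # The paginator follows the continuation tokens, so no manual loop is needed.
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=src_bucket, Prefix=src_prefix):
        # "Contents" is missing when a page has no matching keys.
        for item in page.get("Contents", []):
            key = item["Key"]
            if key.endswith("/"):
                continue  # skip zero-byte "folder" placeholder objects
            new_key = destination_key(key, src_prefix, dest_prefix)
            # Server-side copy: the data never passes through this machine.
            s3_client.copy_object(
                Bucket=dest_bucket,
                Key=new_key,
                CopySource={"Bucket": src_bucket, "Key": key},
            )
            copied.append(new_key)
    return copied
```

With the names from the question, the call would look like `copy_prefix(boto3.client("s3"), "B1", "x/", "B2", "y/")`. Note that a single `copy_object` call is limited to objects of up to 5 GB; for larger files the client's managed `copy` method is probably the better fit.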
Upvotes: 0
Reputation: 49
You can use the code below to transfer files from one bucket to another while preserving a nested folder structure like yours. You don't have to define any specific key or folder structure, because each object keeps its original key:
import boto3
s3 = boto3.resource('s3')
src_bucket = s3.Bucket('source_bucket_name')  # placeholder name
dest_bucket = s3.Bucket('destination_bucket_name')  # placeholder name
dest_bucket.objects.all().delete()  # optional: empty the destination bucket first
for obj in src_bucket.objects.all():
    # Write each object to the destination bucket under the same key.
    s3.Object(dest_bucket.name, obj.key).put(Body=obj.get()["Body"].read())
If you want to clear your source bucket once the files are moved, you can call src_bucket.objects.all().delete() at the end of your code.
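If the source bucket also holds files you did not copy, emptying the whole bucket may remove more than intended. One alternative is to delete only the keys that were actually copied. The following is a rough sketch under that assumption: `chunked` and `delete_keys` are hypothetical helper names, `s3_client` is assumed to be a `boto3.client("s3")`, and the batch size of 1000 reflects the per-request limit of `delete_objects`.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def delete_keys(s3_client, bucket, keys, batch_size=1000):
    """Delete the given keys in batches; return the keys S3 reported as failed."""
    failed = []
    for batch in chunked(list(keys), batch_size):
        response = s3_client.delete_objects(
            Bucket=bucket,
            # Quiet mode makes the response list only the keys that failed.
            Delete={"Objects": [{"Key": key} for key in batch], "Quiet": True},
        )
        failed.extend(error["Key"] for error in response.get("Errors", []))
    return failed
```

Calling `delete_keys(boto3.client("s3"), "source_bucket_name", copied_source_keys)` after the copy loop would then remove just those objects, where `copied_source_keys` is whatever list of source keys your own loop collected. An empty return value means S3 reported no failures.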
Upvotes: 1