Reputation: 3278
I am trying to move files older than an hour from one S3 bucket to another S3 bucket using a Python boto3 AWS Lambda function, with the following cases:
I got some help to move files using the Python code posted by @John Rotenstein:
import boto3
from datetime import datetime, timedelta
SOURCE_BUCKET = 'bucket-a'
DESTINATION_BUCKET = 'bucket-b'
s3_client = boto3.client('s3')
# Create a reusable Paginator
paginator = s3_client.get_paginator('list_objects_v2')
# Create a PageIterator from the Paginator
page_iterator = paginator.paginate(Bucket=SOURCE_BUCKET)
# Loop through each object, looking for ones older than a given time period
for page in page_iterator:
    for object in page.get('Contents', []):  # 'Contents' is absent when the bucket is empty
        if object['LastModified'] < datetime.now().astimezone() - timedelta(hours=1): # <-- Change time period here
            print(f"Moving {object['Key']}")

            # Copy object
            s3_client.copy_object(
                Bucket=DESTINATION_BUCKET,
                Key=object['Key'],
                CopySource={'Bucket': SOURCE_BUCKET, 'Key': object['Key']}
            )

            # Delete original object
            s3_client.delete_object(Bucket=SOURCE_BUCKET, Key=object['Key'])
How can this be modified to cater to the requirement?
Upvotes: 3
Views: 16083
Reputation: 269340
An alternate approach would be to use Amazon S3 Replication, which can replicate bucket contents:
Replication is frequently used when organizations need another copy of their data in a different region, or simply for backup purposes. For example, critical company information can be replicated to another AWS Account that is not accessible to normal users. This way, if some data was deleted, there is another copy of it elsewhere.
Replication requires versioning to be activated on both the source and destination buckets. If you require encryption, use standard Amazon S3 encryption options. The data will also be encrypted during transit.
You configure a source bucket and a destination bucket, then specify which objects to replicate by providing a prefix or a tag. Objects will only be replicated once Replication is activated. Existing objects will not be copied. Deletion is intentionally not replicated to avoid malicious actions. See: What Does Amazon S3 Replicate?
There is no "additional" cost for S3 Replication itself, but you will still be charged for Data Transfer when moving objects between regions, for API Requests (which are tiny charges), and of course for storage.
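For reference, such a replication configuration can also be applied with boto3. This is a minimal sketch, assuming a replication IAM role already exists; the role ARN and bucket names below are placeholders:

import boto3

s3_client = boto3.client('s3')

# Versioning must be enabled on both the source and destination buckets
for bucket in ('bucket-a', 'bucket-b'):
    s3_client.put_bucket_versioning(
        Bucket=bucket,
        VersioningConfiguration={'Status': 'Enabled'}
    )

# Replicate new objects in bucket-a to bucket-b (existing objects are not copied)
s3_client.put_bucket_replication(
    Bucket='bucket-a',
    ReplicationConfiguration={
        'Role': 'arn:aws:iam::111111111111:role/s3-replication-role',  # placeholder role ARN
        'Rules': [{
            'ID': 'replicate-to-bucket-b',
            'Priority': 1,
            'Status': 'Enabled',
            'Filter': {'Prefix': ''},  # empty prefix = replicate all objects
            'DeleteMarkerReplication': {'Status': 'Disabled'},  # deletions are not replicated
            'Destination': {'Bucket': 'arn:aws:s3:::bucket-b'},
        }],
    }
)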
Upvotes: 3
Reputation: 269340
If the buckets are in different regions, that is a non-issue. You can just copy the object between buckets and Amazon S3 will figure it out.
Copying between buckets in different AWS accounts is a bit harder, because the code will use a single set of credentials that must have ListBucket and GetObject access on the source bucket, plus PutObject rights on the destination bucket.
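For illustration, an inline policy on the Lambda function's IAM role along these lines would cover that access. This is only a sketch: the bucket names come from the question's code, the role name matches the Role-A used later in this answer, and DeleteObject is included because the question's code deletes the source object after copying:

import json
import boto3

iam_client = boto3.client('iam')

# Hypothetical inline policy for the Lambda function's role
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {   # List the source bucket
            "Effect": "Allow",
            "Action": "s3:ListBucket",
            "Resource": "arn:aws:s3:::bucket-a"
        },
        {   # Read (and delete, since this is a move) objects in the source bucket
            "Effect": "Allow",
            "Action": ["s3:GetObject", "s3:DeleteObject"],
            "Resource": "arn:aws:s3:::bucket-a/*"
        },
        {   # Write objects to the destination bucket
            "Effect": "Allow",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::bucket-b/*"
        }
    ]
}

iam_client.put_role_policy(
    RoleName='Role-A',  # placeholder role name
    PolicyName='s3-move-between-buckets',
    PolicyDocument=json.dumps(policy)
)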
Also, if credentials are being used from the Source account, then the copy must be performed with ACL='bucket-owner-full-control', otherwise the Destination account won't have access rights to the object. This is not required when the copy is being performed with credentials from the Destination account.
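Applied to the code in the question, that change would look like this (only the copy_object() call changes):

            # Copy object, granting the destination bucket's owner full control of the copy
            s3_client.copy_object(
                Bucket=DESTINATION_BUCKET,
                Key=object['Key'],
                CopySource={'Bucket': SOURCE_BUCKET, 'Key': object['Key']},
                ACL='bucket-owner-full-control'  # only needed with Source-account credentials
            )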
Let's say that the Lambda code is running in Account-A and is copying an object to Account-B. An IAM Role (Role-A) is assigned to the Lambda function. It's pretty easy to give Role-A access to the buckets in Account-A. However, the Lambda function will need permissions to PutObject in the bucket (Bucket-B) in Account-B. Therefore, you'll need to add a bucket policy to Bucket-B that allows Role-A to PutObject into the bucket. This way, Role-A has permission to read from Bucket-A and write to Bucket-B.
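As a sketch, that bucket policy on Bucket-B could look like the following, applied here with boto3 using credentials from Account-B; the Account-A account ID and names are placeholders:

import json
import boto3

# Run with credentials from Account-B, the owner of Bucket-B
s3_client_b = boto3.client('s3')

bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AllowRoleAToPutObjects",
            "Effect": "Allow",
            "Principal": {"AWS": "arn:aws:iam::111111111111:role/Role-A"},  # placeholder Account-A ID
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::bucket-b/*"
        }
    ]
}

s3_client_b.put_bucket_policy(Bucket='bucket-b', Policy=json.dumps(bucket_policy))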
So, putting it all together:
- Create an IAM Role (Role-A) for the Lambda function
- Add a bucket policy to Bucket-B that permits Role-A to PutObject into the bucket
- In the copy_object() command, include ACL='bucket-owner-full-control' (this is the only coding change needed)

Upvotes: 3