user3606138

Reputation: 75

Daily S3 sync between two buckets based on date

I'm doing an S3 sync from source to dest and I only want to sync a particular folder each day, based on the date. The source S3 bucket is currently laid out as s3://bucket/year/month/day/min. For the first-time load I ran the S3 sync command below, which took 4 hours:

aws s3 sync s3://source-bucket/ s3://destination-bucket 

However, I want to do something like this to save time:

aws s3 sync s3://source-bucket/year/month/day s3://destination-bucket/year/month/day

Question: is there a way to pass parameters for the year, month, and day parts so that this is automated? E.g. if I run the script today, it should run:

aws s3 sync s3://source-bucket/2019/03/11 s3://destination-bucket/2019/03/11

My shell script game isn't that strong, so I'm trying to see if there is a good way to do this.

Upvotes: 0

Views: 2980

Answers (2)

John Rotenstein

Reputation: 270144

One option is to extract the path of the "latest file" from the source bucket, and use that to copy to the destination.

This command will provide the Key of the file that was last modified:

aws s3api list-objects-v2 --bucket my-bucket --query 'sort_by(Contents, &LastModified)[-1].Key' --output text

You could then manipulate the return value by removing the filename and use the remaining path in the aws s3 sync command.
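As a rough sketch of that last step, shell parameter expansion can drop the filename from the returned Key. The helper name prefix_from_key and the sample key below are made up for illustration; in practice the key would be the output of the list-objects-v2 command above, and the bucket names are the placeholders from the question.

```shell
#!/usr/bin/env bash
# prefix_from_key is a hypothetical helper, not part of the AWS CLI.
# It strips the file name from an object key, leaving the "folder" prefix.
prefix_from_key() {
  local key="$1"
  case "$key" in
    */*) printf '%s\n' "${key%/*}" ;;  # drop everything after the last slash
    *)   printf '\n' ;;                # key sits at the bucket root: no prefix
  esac
}

# Illustrative key only (assumed file name under year/month/day/min).
latest_key='2019/03/11/45/data.json'
prefix="$(prefix_from_key "$latest_key")"

# Print the sync command that would be run for that prefix.
echo "aws s3 sync s3://source-bucket/$prefix s3://destination-bucket/$prefix"
```

Note that this syncs the folder containing the newest object, which in the question's layout is the minute-level folder; applying the same expansion once more to the prefix would give the day-level folder.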

Upvotes: 1

user3606138

Reputation: 75

Figured this one out. This is what I wrote:

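export current_date=$(date +%Y%m%d)
# Take every component from yesterday's date, so the path is still correct
# on the first day of a month or year (--date requires GNU date)
export previous_date=$(date --date='1 day ago' '+%Y%m%d')
export Year=$(date --date='1 day ago' '+%Y')
export Month=$(date --date='1 day ago' '+%m')
export day=$(date --date='1 day ago' '+%d')
export SOURCE_S3="s3://Source/$Year/$Month/$day/"
export DESTINATION_S3="s3://DESTINATION/$Year/$Month/$day/"

# Show the command, then run it
echo "aws s3 sync $SOURCE_S3 $DESTINATION_S3"
aws s3 sync "$SOURCE_S3" "$DESTINATION_S3"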
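For anyone who wants to pass the date in as a parameter, here is a minimal sketch of the same idea as a function. It assumes GNU date (the -d option); date_prefix and sync_day are hypothetical names, and the bucket names are the placeholders from the question. All three path parts come from a single date call, so they cannot disagree with each other.

```shell
#!/usr/bin/env bash
# date_prefix is a hypothetical helper; it needs GNU date for the -d option.
# It prints year/month/day for the day before the given date (default: today).
date_prefix() {
  local base="${1:-$(date +%Y-%m-%d)}"
  date -d "$base 1 day ago" +%Y/%m/%d
}

# sync_day is also hypothetical; it only wraps the aws s3 sync call.
# Usage: sync_day            (syncs yesterday's folder)
#        sync_day 2019-03-12 (syncs the folder for 2019/03/11)
sync_day() {
  local prefix
  prefix="$(date_prefix "$@")" || return 1
  aws s3 sync "s3://source-bucket/$prefix/" "s3://destination-bucket/$prefix/"
}

# Month boundary check: the day before 1 March 2019.
date_prefix 2019-03-01
```

The last line should print 2019/02/28, i.e. the month rolls back together with the day.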

Upvotes: 1
