Reputation: 471
I need to delete files in Amazon S3 that are older than seven days. I was looking for a shell script to do this, but had no luck with a Google search. I found the URL below:
http://shout.setfive.com/2011/12/05/deleting-files-older-than-specified-time-with-s3cmd-and-bash/
It was not helpful in our case. What would be a script to delete all files older than seven days?
Upvotes: 16
Views: 77156
Reputation: 8325
Easiest way as of Oct 2023
I had a similar situation to deal with in a multi-tenant setup, where a single user could create N buckets, each with a different retention period in days. Since I had all the bucket-related config available at the database level, a simple use of the s3cmd expire command did the job.
Set expiry:
s3cmd expire s3://BUCKET_PATH --expiry-days 7 --access_key=ACCESS_KEY --secret_key=SECRET_KEY
List the lifecycle configuration:
s3cmd getlifecycle s3://BUCKET_PATH --access_key=ACCESS_KEY --secret_key=SECRET_KEY
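Since the retention days come from the database in my case, here is a minimal sketch of how the per-bucket calls could be driven; the tenant_buckets.csv file and its bucket,days format are hypothetical stand-ins for that config:
# Hypothetical input: one "bucket,days" line per tenant bucket,
# exported from the database that holds the retention config.
while IFS=, read -r bucket days; do
  s3cmd expire "s3://${bucket}" --expiry-days "${days}" \
    --access_key="$ACCESS_KEY" --secret_key="$SECRET_KEY"
done < tenant_buckets.csv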
Upvotes: 3
Reputation: 797
I created the script below and run it with cron. In my setup, one backup file is generated per day, so the script keeps only the last seven days of backups by deleting the file that is eight days old.
#!/bin/bash
# Purpose: functional 7-day backup retention policy.
# Count the objects in the bucket (nl numbers the lines; the last number is the total).
count=$(/usr/bin/sudo /usr/local/bin/aws s3 ls bucketname | nl | tail -n1 | awk '{print $1}')
# Once the 8th daily backup exists, delete the oldest one.
if [[ "$count" -eq 8 ]]
then
    # With date-stamped names, key order matches age, so the first entry is the oldest.
    filename=$(/usr/bin/sudo /usr/local/bin/aws s3 ls bucketname | awk '{print $NF}' | head -n1)
    /usr/bin/sudo /usr/local/bin/aws s3 rm "s3://bucketname/$filename"
fi
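To run it daily with cron as described, a crontab entry along these lines would work (the path, schedule, and log file are illustrative):
# Run the retention script every day at 02:00 (script path is hypothetical).
0 2 * * * /home/backup/s3_retention.sh >> /var/log/s3_retention.log 2>&1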
Upvotes: 0
Reputation: 21201
I was looking for an s3cmd command to delete files older than N days, and here is what worked for me:
s3cmd ls s3://your-address-here/ | awk -v dys="2" 'BEGIN { depoch=(dys*86400);cepoch=(systime()-depoch) } { gsub("-"," ",$1);gsub(":"," ",$2 );if (mktime($1" "$2" 00")<=cepoch) { print "s3cmd del "$4 } }' | bash
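For readability, here is the same pipeline unrolled into a multi-line sketch; note that systime() and mktime() are GNU awk extensions, so this requires gawk:
s3cmd ls s3://your-address-here/ | awk -v dys="2" '
BEGIN {
  depoch = dys * 86400            # retention window in seconds
  cepoch = systime() - depoch     # cutoff epoch: now minus N days
}
{
  gsub("-", " ", $1)              # 2020-06-25 -> 2020 06 25
  gsub(":", " ", $2)              # 12:34      -> 12 34
  if (mktime($1 " " $2 " 00") <= cepoch)
    print "s3cmd del " $4         # emit one delete command per old file
}' | bash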
Upvotes: 4
Reputation: 631
I slightly modified the one from Prabhu R so the shell script runs on Mac OS X (I tested with Mac OS X v10.13 (High Sierra)); it uses gdate from GNU coreutils instead of the BSD date that ships with macOS:
BUCKETNAME=s3://BucketName/WithOrWithoutDirectoryPath/
aws s3 ls $BUCKETNAME | while read -r line;
do
    # Fields 1 and 2 of the `aws s3 ls` output are the date and time.
    createDate=`echo $line | awk '{print $1" "$2}'`
    createDate=`gdate -d "$createDate" +%s`
    olderThan=`gdate -d '1 week ago' +%s`
    if [[ $createDate -lt $olderThan ]]
    then
        # Field 4 is the object name; it is empty on the PRE lines
        # that list sub-prefixes, so those are skipped below.
        fileName=`echo $line | awk '{print $4}'`
        if [[ $fileName != "" ]]
        then
            echo "deleting " $BUCKETNAME$fileName
            aws s3 rm $BUCKETNAME$fileName
        fi
    fi
done;
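gdate is not part of stock macOS; it comes with GNU coreutils, which can be installed via Homebrew (assuming Homebrew is already set up):
# Installs GNU coreutils; the GNU tools are exposed with a "g" prefix (gdate, gls, ...).
brew install coreutils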
Upvotes: 0
Reputation: 1
Here is a simple script that I wrote for my environment. The files in my S3 bucket are named in the format FULL_BACKUP_2020-06-25.tar.gz.
#!/bin/bash
# Defining variables.
# Cutoff date from 3 days ago, as YYYYMMDD (e.g. 20200622).
ThreeDaysOldDate=`date -d '-3 days' +%Y-%m-%d | tr -d '-'`
# Pull the YYYYMMDD stamp out of each object name: skip the first
# listing line, take field 4, cut bytes 13-22 of
# FULL_BACKUP_YYYY-MM-DD.tar.gz (the date), and drop the dashes.
Obj=`/usr/local/bin/aws s3 ls s3://bucket_name/folder/ | sed -n '2,$p' | awk '{print $4}' | cut -b 13-22 | tr -d '-'`
# Compare each file's date stamp with the cutoff and remove older files from S3.
for i in $Obj
do
    if [ $i -lt $ThreeDaysOldDate ]; then
        # Rebuild the original filename from the YYYYMMDD stamp.
        var1="FULL_BACKUP_"
        var2=".tar.gz"
        year=$(echo $i | cut -c 1-4)
        mon=$(echo $i | cut -c 5-6)
        day=$(echo $i | cut -c 7-8)
        DATE=$var1$year-$mon-$day$var2
        /usr/local/bin/aws s3 rm s3://bucket_name/folder/$DATE > /dev/null 2>&1
    fi
done
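As a quick check of the extraction step: bytes 13-22 of the filename are exactly the date, since the FULL_BACKUP_ prefix is 12 characters long:
echo "FULL_BACKUP_2020-06-25.tar.gz" | cut -b 13-22 | tr -d '-'
# prints: 20200625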
Upvotes: 0
Reputation: 471
We modified the code a little bit and it works fine.
aws s3 ls s3://BUCKETNAME/ | while read -r line;
do
    # Fields 1 and 2 are the object's last-modified date and time.
    createDate=`echo $line | awk '{print $1" "$2}'`
    createDate=`date -d "$createDate" +%s`
    olderThan=`date --date "7 days ago" +%s`
    if [[ $createDate -lt $olderThan ]]
    then
        fileName=`echo $line | awk '{print $4}'`
        if [[ $fileName != "" ]]
        then
            aws s3 rm s3://BUCKETNAME/$fileName
        fi
    fi
done;
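For reference, the script parses the standard aws s3 ls output, where field 1 is the date, field 2 the time, and field 4 the object name (sample lines, values illustrative):
2020-06-18 10:15:00    5242880 backup_2020-06-18.tar.gz
2020-06-25 10:15:00    5242880 backup_2020-06-25.tar.gz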
Upvotes: 18
Reputation: 1
Based on the solution suggested by @Prabhu R, I patched the code and added variables.
So if you save the below to cleanup.sh, you can run:
./cleanup.sh <bucket_name> <days_beyond_you_want_files_removed | number>
#!/bin/bash
# Usage: ./cleanup.sh <bucket_name> <days>
aws s3 ls $1/ --recursive | while read -r line;
do
    createDate=`echo $line | awk '{print $1" "$2}'`
    createDate=`date -d "$createDate" +%s`
    olderThan=`date --date "$2 days ago" +%s`
    if [[ $createDate -lt $olderThan ]]
    then
        # With --recursive, field 4 holds the full object key.
        fileName=`echo $line | awk '{print $4}'`
        if [[ $fileName != "" ]]
        then
            aws s3 rm s3://$1/$fileName
        fi
    fi
done;
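A concrete invocation, with a hypothetical bucket name, would be:
chmod +x cleanup.sh
./cleanup.sh my-backup-bucket 7   # remove objects older than 7 days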
Upvotes: 0
Reputation: 780
This deletes files older than 159 days, recursively, from the S3 bucket; change the number of days as per your requirement. Unlike the scripts above, it also handles filenames that contain spaces.
Note: the existing directory structure may get deleted, since S3 "directories" disappear once every object under them is removed. If you don't care about the directory structure, you can use the script as-is.
If you would prefer to keep the directory structure, give the full path of the last child directory and modify it on each execution to preserve the parent directory structure.
example:
s3://BucketName/dir1/dir2/dir3/
s3://BucketName/dir1/dir2/dir4/
s3://BucketName/dir1/dir2/dir5/
vim s3_file_delete.sh
s3bucket="s3://BucketName"
s3dirpath="s3://BucketName/WithOrWithoutDirectoryPath/"
aws s3 ls $s3dirpath --recursive | while read -r line;
do
    createDate=`echo $line | awk '{print $1" "$2}'`
    createDate=`date -d "$createDate" +%s`
    olderThan=`date --date "159 days ago" +%s`
    if [[ $createDate -lt $olderThan ]]
    then
        # Join fields 4..NF so keys containing spaces stay intact,
        # then strip the leading space the join introduces.
        fileName=`echo $line | awk '{a="";for (i=4;i<=NF;i++){a=a" "$i}print a}' | awk '{ sub(/^[ \t]+/, ""); print }'`
        if [[ $fileName != "" ]]
        then
            #echo "$s3bucket/$fileName"
            aws s3 rm "$s3bucket/$fileName"
        fi
    fi
done;
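To see why the extra awk stage matters, here is the field-joining step applied to a key that contains spaces (the sample line is made up for illustration):
echo "2020-06-25 10:15:00 1024 dir1/my backup file.tar.gz" |
  awk '{a="";for (i=4;i<=NF;i++){a=a" "$i}print a}' |
  awk '{ sub(/^[ \t]+/, ""); print }'
# prints: dir1/my backup file.tar.gz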
Upvotes: 0
Reputation: 269101
The easiest method is to define an Object Lifecycle Management rule on the Amazon S3 bucket.
You can specify that objects older than a certain number of days should be expired (deleted). The best part is that this happens automatically on a regular basis and you don't need to run your own script.
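As a sketch, a seven-day expiration rule can be applied from the command line with the s3api lifecycle call (the bucket name and rule ID are placeholders):
cat > lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "expire-after-7-days",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "Expiration": { "Days": 7 }
    }
  ]
}
EOF
aws s3api put-bucket-lifecycle-configuration \
  --bucket my-bucket --lifecycle-configuration file://lifecycle.json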
If you wanted to do it yourself, the best approach would be to write a script (e.g. in Python) to retrieve the list of files and delete the ones older than a certain date.
It's somewhat messier to do as a shell script.
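If you do stay in the shell, a somewhat more robust sketch than parsing aws s3 ls output is to let the API filter on LastModified via a JMESPath query (the bucket name is a placeholder, the date command assumes GNU date, and keys containing whitespace would still need extra care):
cutoff=$(date -d '7 days ago' +%Y-%m-%dT%H:%M:%S)   # cutoff in ISO 8601 form
aws s3api list-objects-v2 --bucket my-bucket \
  --query "Contents[?LastModified<='${cutoff}'].Key" --output text |
  tr '\t' '\n' | grep -v '^None$' |   # text output is tab-separated; "None" means no matches
  while read -r key; do
    aws s3 rm "s3://my-bucket/${key}"
  done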
Upvotes: 42