user3123372

Reputation: 744

How do I delete all except the latest 5 recently updated/new files from AWS s3?

I can fetch the five most recently updated files from AWS S3 with the following command:

aws s3 ls s3://somebucket/ --recursive | sort | tail -n 5 | awk '{print $4}'

Now I need to delete every file in the bucket except the 5 files returned by the command above.

Say the command returns 1.txt, 2.txt, 3.txt, 4.txt, and 5.txt. I need to delete everything else from AWS S3 and keep only those five files.

Upvotes: 3

Views: 4408

Answers (3)

emilie zawadzki

Reputation: 2127

Short story: based on @bcattle's answer, this works for AWS CLI 2:

# Sort by timestamp first, drop the newest 5, then keep only the file path.
aws s3 ls s3://[BUCKET_NAME] --recursive | sort | head -n -5 | awk 'NF>1{print $4}' | grep . | while read -r line ; do
    echo "Removing ${line}"
    aws s3 rm "s3://[BUCKET_NAME]/${line}"
done

Long story: under CLI 2, aws s3 ls returns the file path along with its timestamp and size. The script cannot use that line as-is, because only the file path should be appended to the bucket URI.
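
One caveat with awk '{print $4}' is that it cuts a path at its first space. As a minimal sketch, assuming each listing line has the form "DATE TIME SIZE KEY", the helper below (s3_ls_keys is a made-up name, not part of the AWS CLI) strips the first three columns and keeps the rest of the line:

```shell
# Hypothetical helper: reads `aws s3 ls --recursive` lines on stdin and
# prints only the object key, keeping any spaces inside the key.
# Assumes each line looks like: "2024-01-02 10:00:00       1234 path/to/key"
s3_ls_keys() {
    # Remove the leading "DATE TIME SIZE " prefix from every line.
    sed -E 's/^[^ ]+ +[^ ]+ +[0-9]+ //'
}

# Possible usage (not run here):
#   aws s3 ls s3://[BUCKET_NAME] --recursive | sort | head -n -5 | s3_ls_keys
```

It would replace the awk and grep steps in the pipeline; the sort still has to run before it so that ordering is by timestamp.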

Upvotes: 1

bcattle

Reputation: 12839

Use a negative number with head to get all but the last n lines:

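# awk keeps only the file path; each listing line also carries date, time, and size.
aws s3 ls s3://somebucket/ --recursive | sort | head -n -5 | awk '{print $4}' | while read -r line ; do
    echo "Removing ${line}"
    aws s3 rm "s3://somebucket/${line}"
done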
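
A negative count for head is a GNU coreutils feature, so it may not be available everywhere. As a rough sketch for systems where head rejects it, the same "all but the last n lines" step can be done in awk; all_but_last is a made-up helper name:

```shell
# Hypothetical helper: print every input line except the last n (default 5).
all_but_last() {
    awk -v n="${1:-5}" '
        { buf[NR] = $0 }                                    # remember every line
        END { for (i = 1; i <= NR - n; i++) print buf[i] }  # stop n lines early
    '
}

# Possible usage in place of `head -n -5` (not run here):
#   aws s3 ls s3://somebucket/ --recursive | sort | all_but_last 5
```

This buffers the whole listing in memory, which should be fine for modest buckets but is worth keeping in mind for very large ones.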

Upvotes: 7

helloV

Reputation: 52463

Use the aws s3 rm command with multiple --exclude options (I assume the last 5 files do not match a common pattern):

aws s3 rm s3://somebucket/ --recursive --exclude "1.txt" --exclude "2.txt" --exclude "3.txt" --exclude "4.txt" --exclude "5.txt"

CAUTION: Run it with the --dryrun option first and verify that the files to be deleted do not include the 5 files you want to keep before actually removing anything.
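
Typing the --exclude options by hand gets tedious, so here is a hedged sketch that builds them from a list of keys and always passes --dryrun. It assumes exclude patterns are matched relative to the s3://bucket/ source path and that the keys contain no wildcard characters such as * or ?. The function name keep_only_dryrun and the AWS_CMD override are made up for illustration:

```shell
# Hypothetical helper: keep_only_dryrun BUCKET
# Reads the keys to keep from stdin (one per line) and runs `aws s3 rm`
# in --dryrun mode with one --exclude per key.
# AWS_CMD is an assumed override, handy for previewing with AWS_CMD=echo.
keep_only_dryrun() {
    bucket=$1
    set --    # reuse the positional parameters as the argument list
    while IFS= read -r key; do
        if [ -n "$key" ]; then
            set -- "$@" --exclude "$key"
        fi
    done
    "${AWS_CMD:-aws}" s3 rm "s3://${bucket}/" --recursive --dryrun "$@"
}

# Possible usage with the command from the question (not run here):
#   aws s3 ls s3://somebucket/ --recursive | sort | tail -n 5 | awk '{print $4}' \
#       | keep_only_dryrun somebucket
```

Once the dry-run output looks right, the same command without --dryrun would perform the actual deletion.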

Upvotes: 6
