saurabh agarwal

Reputation: 2184

Deleting S3 files using AWS Data Pipeline

I want to delete all S3 keys starting with a given prefix using AWS Data Pipeline. I am using a ShellCommandActivity for this.

These are the arguments:

  "scriptUri": "https://s3.amazonaws.com/my_s3_bucket/hive/removeExitingS3.sh",
  "scriptArgument": "s3://my_s3_bucket/output/2017-03-19",

I want to delete all S3 keys starting with 2017-03-19 in the output folder. What command should I use to do this?

I have tried these commands in the .sh file:

  sudo yum -y upgrade aws-cli 
  aws s3 rm $1 --recursive

This is not working.

Sample files are:

s3://my_s3_bucket/output/2017-03-19/1.txt
s3://my_s3_bucket/output/2017-03-19/2.txt
s3://my_s3_bucket/output/2017-03-19_3.txt

EDIT:

The date (2017-03-19) is dynamic; it is the output of #{format(@scheduledStartTime,"YYYY-MM-dd")}. So effectively:

  "scriptArgument": "s3://my_s3_bucket/output/#{format(@scheduledStartTime,'YYYY-MM-dd')}"

Upvotes: 1

Views: 1099

Answers (1)

franklinsijo

Reputation: 18300

Try

aws s3 rm "$1" --recursive --exclude "*" --include "2017-03-19*" --include "2017-03-19/*"

with

"scriptArgument": "s3://my_s3_bucket/output/"

EDIT: As the date is a dynamic parameter, pass it as a second scriptArgument to the ShellCommandActivity and reference it as $2 in the script:

aws s3 rm "$1" --recursive --exclude "*" --include "$2*" --include "$2/*"
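
For what it's worth, here is a rough sketch of how the whole script could look with the two-argument approach. This is only an illustration, not the asker's actual removeExitingS3.sh. The function name delete_by_date_prefix and the AWS_CMD override are made up for this example, and it assumes the pipeline passes the base S3 URI and the formatted date as two separate scriptArgument entries. The guard is there because an empty $2 would turn the include pattern into "*" and match every key under the base URI.

```shell
#!/bin/bash
# Illustrative sketch only, not the original removeExitingS3.sh.
# $1 = base S3 URI, e.g. s3://my_s3_bucket/output/
# $2 = date prefix, e.g. 2017-03-19
# Assumption: the pipeline supplies one scriptArgument entry per positional parameter.

delete_by_date_prefix() {
  local base="$1" date_prefix="$2"

  # Refuse to run with a missing argument: an empty prefix would turn
  # --include "$2*" into --include "*" and match every key under the base URI.
  if [ -z "$base" ] || [ -z "$date_prefix" ]; then
    echo "usage: delete_by_date_prefix <s3-base-uri> <date-prefix> [extra aws flags]" >&2
    return 1
  fi

  # AWS_CMD is a hypothetical override (defaults to the real aws CLI) so the
  # command line can be previewed, e.g. with AWS_CMD=echo.
  # Any arguments after the second one are passed through to aws unchanged.
  "${AWS_CMD:-aws}" s3 rm "$base" --recursive \
    --exclude "*" --include "${date_prefix}*" --include "${date_prefix}/*" \
    "${@:3}"
}

# Run only when both arguments were supplied on the command line.
if [ "$#" -ge 2 ]; then
  delete_by_date_prefix "$@"
fi
```

Before letting it delete anything, it may be worth running it by hand once with the AWS CLI's --dryrun flag as a third argument, which lists the operations without performing them.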

Upvotes: 1
