Harish Singh

Reputation: 51

What is the fastest and most cost-efficient way of moving objects from one S3 folder to another folder in the same bucket?

I have an example bucket with a demo folder containing sub-folders and files in it. Bucket structure:

example/demo/*.jpeg #.jpeg files 
example/demo/sub-folder1
example/demo/sub-folder2

My objective is to move all the .jpeg files from the demo/ folder, excluding .jpeg files in sub-folder1 and sub-folder2, to a new folder /example/archive-jpeg/.

Seeking help to find the fastest and most cost-efficient way of doing this with the AWS CLI (awscli v2):

  1. list all .jpeg objects in the immediate /demo folder,
  2. move them to the /archive-jpeg folder, and
  3. delete the .jpeg files from the /demo folder after archival.

Thank You!

Upvotes: 2

Views: 3668

Answers (1)

John Rotenstein

Reputation: 269500

This should do it:

aws s3 mv s3://bucket/demo/  s3://bucket/target-folder/ --recursive --exclude "*" --include "*.jpeg" --exclude "*/*"

The logic is:

  • aws s3 mv --recursive tells it to move all objects
  • --exclude "*" tells it to exclude all objects from being moved
  • --include "*.jpeg" tells it to include objects ending with .jpeg
  • --exclude "*/*" tells it to exclude anything in sub-directories (e.g. sub-folder1/ and sub-folder2/)

See: AWS CLI: Use of Exclude and Include Filters

As for fastest and most cost-efficient: since you have specified that you want to do it via the AWS CLI, there are no other options.

If you are willing to do it without the AWS CLI, then a faster way to move the objects would be to write some code that runs in parallel to send individual Copy and Delete API calls to Amazon S3. (There is no 'move' command in S3 — the AWS CLI is actually copying the objects and then deleting the original objects.)
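
A minimal sketch of that approach in Python with boto3 (the bucket and folder names below follow the question; the worker count is an arbitrary choice):

# Sketch only: copy then delete each .jpeg directly under demo/, in parallel.
from concurrent.futures import ThreadPoolExecutor

import boto3

s3 = boto3.client("s3")
BUCKET = "example"            # bucket name from the question
SRC_PREFIX = "demo/"
DST_PREFIX = "archive-jpeg/"

def list_jpeg_keys():
    # Delimiter="/" keeps sub-folder1/ and sub-folder2/ out of the listing
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=BUCKET, Prefix=SRC_PREFIX, Delimiter="/"):
        for obj in page.get("Contents", []):
            if obj["Key"].endswith(".jpeg"):
                yield obj["Key"]

def move_one(key):
    # S3 has no native move: copy to the archive prefix, then delete the original
    new_key = DST_PREFIX + key[len(SRC_PREFIX):]
    s3.copy_object(Bucket=BUCKET, CopySource={"Bucket": BUCKET, "Key": key}, Key=new_key)
    s3.delete_object(Bucket=BUCKET, Key=key)

with ThreadPoolExecutor(max_workers=10) as pool:
    list(pool.map(move_one, list_jpeg_keys()))

Note that copy_object handles objects up to 5 GB in a single call; larger objects would need a multipart copy, which the AWS CLI does for you automatically.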

Or, rather than moving objects as a batch, you could configure an Amazon S3 Event to trigger an AWS Lambda function that moves the files as soon as they are created, which would result in them moving 'faster' than running as a batch.
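
A minimal sketch of such a Lambda handler, assuming the S3 event notification is configured on the demo/ prefix with a .jpeg suffix filter so most of the filtering happens before the function runs:

# Sketch only: handler for an S3 ObjectCreated event notification.
import urllib.parse

import boto3

s3 = boto3.client("s3")

def handler(event, context):
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = urllib.parse.unquote_plus(record["s3"]["object"]["key"])
        # Only move .jpeg objects directly under demo/, not those in sub-folders
        if not key.startswith("demo/") or "/" in key[len("demo/"):] or not key.endswith(".jpeg"):
            continue
        new_key = "archive-jpeg/" + key.rsplit("/", 1)[-1]
        s3.copy_object(Bucket=bucket, CopySource={"Bucket": bucket, "Key": key}, Key=new_key)
        s3.delete_object(Bucket=bucket, Key=key)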

As for cost efficient, the objects need to be copied and deleted, which would result in API calls to Amazon S3 at a cost of $0.005 per 1000 requests. I don't think you could avoid these API calls, so there would be no way to make it even lower cost.
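
As a rough illustration, moving 10,000 objects would mean roughly 10,000 Copy requests, which at $0.005 per 1,000 requests works out to about $0.05.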

Upvotes: 5
