Siddhartha Sarnobat

Reputation: 29

I want to use S3P to copy about 100 TB from approximately 7 buckets.

Some of the objects in my S3 buckets are encrypted with a KMS key that I don't have access to, so I want to exclude those objects from the copy.

I have tried S3 sync, which has an --exclude switch that skips those files, so S3 sync works. However, the data size is around 100 TB and the copy needs to be completed within 2 days.

I want to know if that option is present in S3P as well.

https://www.genui.com/open-source/s3p-massively-parallel-s3-copying

This is the command that works for me with S3 sync:

aws s3 sync s3://bucket s3://mybucket --exclude "folder/*"

Upvotes: 0

Views: 442

Answers (2)

Shane Brinkman-Davis

Reputation: 699

S3P author here. S3P has a number of ways of selecting which files to process. You can see all the options with npx s3p cp --help. In particular, I'd suggest either:

  1. Run S3P and filter out the unwanted keys: --filter "js:({Key}) => !/^folder\//.test(Key)"
  2. If there are a LOT of those unwanted keys, running S3P twice might be faster: once with --stop-at "folder/" and once with --start-after "folder/~" (nothing special about "~" - it's just the last supported character in the character range). Both options are sketched below.
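For concreteness, here is a rough sketch of both approaches. The source/destination options (--bucket and --to-bucket) and the bucket and prefix names (source-bucket, dest-bucket, folder/) are placeholders and assumptions on my part - verify the exact option names against npx s3p cp --help:

  # Option 1: one pass, skipping keys under folder/ with a JavaScript filter
  npx s3p cp --bucket source-bucket --to-bucket dest-bucket \
    --filter "js:({Key}) => !/^folder\//.test(Key)"

  # Option 2: two passes that together cover everything except folder/
  npx s3p cp --bucket source-bucket --to-bucket dest-bucket --stop-at "folder/"
  npx s3p cp --bucket source-bucket --to-bucket dest-bucket --start-after "folder/~"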

Upvotes: 0

John Rotenstein

Reputation: 270089

An alternate approach is to use Amazon S3 Batch Operations to transfer the files.

It requires an input manifest file that lists the objects to copy. If there is a large number of objects, you can generate the manifest through Amazon S3 Inventory and then remove the directories/files that you don't want to copy.

Then, create an S3 Batch Operations job to copy the listed objects.
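For reference, a CSV manifest for Batch Operations lists one object per line as bucket,key (optionally with a version ID as a third column), with keys URL-encoded. Here is a minimal sketch of trimming a downloaded S3 Inventory CSV before creating the job - inventory.csv, manifest.csv, the manifest-bucket name and the folder/ prefix are all placeholders, and the exact match pattern depends on how your inventory report quotes its fields:

  # Drop rows whose key falls under the prefix you cannot decrypt
  # (approximate match; adjust to your inventory's CSV quoting)
  grep -v '"folder/' inventory.csv > manifest.csv

  # Put the trimmed manifest somewhere the Batch Operations job can read it
  aws s3 cp manifest.csv s3://manifest-bucket/manifest.csv

The job itself can then be created in the S3 console or with aws s3control create-job, pointing at the trimmed manifest.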

See: Copying objects using S3 Batch Operations - Amazon Simple Storage Service

Upvotes: 1
