Jay

Reputation: 770

AWS CLI Download list of S3 files

We have ~400,000 files in a private S3 bucket; they are inbound/outbound call recordings. The file names follow a pattern that lets me search for both inbound and outbound numbers. Note that these recordings are in the Glacier storage class.

Using the AWS CLI, I can list this bucket and grep out the files I need. What I'd like to do now is start an S3 restore job with Expedited retrieval (so ~1-5 minute recovery time), and then maybe 30 minutes later run a command to download the files.

My efforts so far:

aws s3 ls s3://exetel-logs/ --recursive | grep '042222222' | cut -c 32-

Retrieves the keys of about 200 files. I am unsure how to proceed next, as aws s3 cp won't work for objects in the Glacier storage class.

Cheers,

Upvotes: 0

Views: 1717

Answers (2)

Dunedan

Reputation: 8435

The AWS CLI has two separate commands for S3: s3 and s3api. s3 is a high-level abstraction with limited features, so to restore files you'll have to use one of the commands available in s3api:

aws s3api restore-object --bucket exetel-logs --key your-key --restore-request '{"Days":1,"GlacierJobParameters":{"Tier":"Expedited"}}'
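Since the question is about roughly 200 keys, the single command above can be wrapped in a loop. This is only a minimal sketch: the function name restore_keys and the default of 1 day for the restored copy are my own assumptions, and it expects one key per line on stdin (for example from the ls | grep | cut pipeline in the question).

```shell
# Hypothetical helper (not part of the AWS CLI): request an Expedited
# restore for every key read from stdin.
# Usage: restore_keys BUCKET [DAYS]
restore_keys() {
  local bucket="$1" days="${2:-1}" key
  while IFS= read -r key; do
    [ -n "$key" ] || continue   # skip blank lines
    # </dev/null keeps the CLI from consuming the key list on stdin.
    aws s3api restore-object \
      --bucket "$bucket" \
      --key "$key" \
      --restore-request "{\"Days\":${days},\"GlacierJobParameters\":{\"Tier\":\"Expedited\"}}" \
      </dev/null || echo "restore request failed for: $key" >&2
  done
}

# Example, assuming the pipeline from the question:
# aws s3 ls s3://exetel-logs/ --recursive | grep '042222222' | cut -c 32- | restore_keys exetel-logs
```

Reading the keys with while read instead of a for loop keeps keys that contain spaces intact.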

If you then want to copy the files, but only those that have actually been restored from Glacier, you can use the following snippet, which prints the keys whose restore has completed:

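# Print the key of every GLACIER object whose restore has completed.
aws s3api list-objects-v2 --bucket exetel-logs --query "Contents[?StorageClass=='GLACIER'].[Key]" --output text |
while IFS= read -r key; do
  # Restore is empty until a restore is requested; ongoing-request="false" means the copy is ready.
  restore=$(aws s3api head-object --bucket exetel-logs --key "$key" --query Restore --output text </dev/null)
  case "$restore" in
    *'ongoing-request="false"'*) echo "$key" ;;
  esac
done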
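Once a key passes that check, the restored copy can be fetched with s3api get-object. The following is a rough sketch rather than a definitive solution: download_restored and its two arguments are names I made up, and it again reads one key per line from stdin.

```shell
# Hypothetical helper (not part of the AWS CLI): download every key read
# from stdin whose Glacier restore has completed, skipping the rest.
# Usage: download_restored BUCKET DEST_DIR
download_restored() {
  local bucket="$1" dest="$2" key restore
  while IFS= read -r key; do
    [ -n "$key" ] || continue   # skip blank lines
    # The Restore field contains ongoing-request="false" once the copy is ready.
    restore=$(aws s3api head-object --bucket "$bucket" --key "$key" \
      --query Restore --output text </dev/null)
    case "$restore" in
      *'ongoing-request="false"'*)
        # Recreate the key's prefix as local directories before downloading.
        mkdir -p "$dest/$(dirname "$key")"
        aws s3api get-object --bucket "$bucket" --key "$key" \
          "$dest/$key" </dev/null >/dev/null
        ;;
      *)
        echo "not restored yet, skipping: $key" >&2
        ;;
    esac
  done
}

# Example, assuming the pipeline from the question:
# aws s3 ls s3://exetel-logs/ --recursive | grep '042222222' | cut -c 32- | download_restored exetel-logs ./recordings
```

Keys that are skipped are reported on stderr, so the same command can simply be rerun a few minutes later.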

Upvotes: 2

rvd

Reputation: 568

Have you considered using a high-level language wrapper for the AWS CLI? It will make these kinds of tasks easier to integrate into your workflows. I prefer the Python implementation (Boto 3). Here is example code for how to download all files from an S3 bucket.

Upvotes: 0
