Reputation: 785
I am having trouble downloading multiple files from AWS S3 buckets to my local machine.
I have the names of all the files I want to download, and I don't want any others. How can I do that? Is there some kind of loop in aws-cli that lets me iterate over the filenames?
There are a couple hundred files to download, so it doesn't seem possible to use a single command that takes all the filenames as arguments.
Upvotes: 66
Views: 130964
Reputation: 199
@Rajan's answer is a very good one; however, it fails when a filename in the *.txt file has no match in the source S3 bucket. The code below handles that case as well:
#!/bin/bash
# Copy each object named in try.txt; a failed copy is reported but the loop
# keeps going, because there is no "set -e" here.
while IFS= read -r line; do
    aws s3 cp "s3://your-s3-source-bucket/folder/$line" s3://your-s3-destination/folder/
done < try.txt
The only thing you need to do is run the bash file inside your AWS notebook:
!chmod +x YOUR-BASH-NAME.sh
!./YOUR-BASH-NAME.sh
Upvotes: 1
Reputation: 477
I wanted to read S3 object keys from a text file and download the corresponding objects to my machine in parallel.
I used this command:
cat <filename>.txt | parallel aws s3 cp {} <output_dir>
The contents of my text file looked like this:
s3://bucket-name/file1.wav
s3://bucket-name/file2.wav
s3://bucket-name/file3.wav
Please make sure you don't have an empty line at the end of your text file. You can learn more in the GNU parallel documentation.
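As a side note: if the default number of parallel jobs doesn't suit your connection, the -j option caps the number of concurrent transfers, and :::: reads the arguments from a file instead of stdin. A sketch, with filenames.txt and ./downloads/ as placeholders:
# Run at most 8 downloads at a time, reading the S3 URIs from filenames.txt.
parallel -j8 aws s3 cp {} ./downloads/ :::: filenames.txt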
Upvotes: 5
Reputation: 3638
One can also use the --recursive option, as described in the documentation for the cp command. It will copy all objects under a specified prefix recursively.
Example:
aws s3 cp s3://folder1/folder2/folder3 . --recursive
will grab all the files under folder1/folder2/folder3 (here folder1 is the bucket name) and copy them to the local directory.
Upvotes: 83
Reputation: 8622
Tried all the above. Not much joy. Finally, I adapted @Rajan's reply into a one-liner:
for file in whatever*.txt; do aws s3 cp "$file" s3://somewhere/in/my/bucket/; done
Upvotes: 7
Reputation: 497
As per the docs, you can use --include and --exclude filters with s3 cp as well. So you can do something like this:
aws s3 cp s3://bucket/folder/ . --recursive --exclude="*" --include="2017-12-20*"
Make sure you get the order of the --exclude and --include filters right, as that could change the whole meaning.
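To illustrate: filters are applied in the order they appear, and a later filter takes precedence over an earlier one. Swapping the two flags above (same placeholder bucket) downloads nothing, because the trailing --exclude overrides the include:
# Copies nothing: --exclude="*" comes last and takes precedence.
aws s3 cp s3://bucket/folder/ . --recursive --include="2017-12-20*" --exclude="*"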
Upvotes: 35
Reputation: 420
Here is a bash script which can read all the filenames from a file filename.txt:
#!/bin/bash
set -e
# Download each object named in filename.txt; set -e aborts on the first failure.
while IFS= read -r line; do
    aws s3 cp "s3://bucket-name/$line" dest-path/
done < filename.txt
Upvotes: 35
Reputation: 3527
You might want to use "sync" instead of "cp". The following will download/sync only the files with the ".txt" extension into your local folder:
aws s3 sync --exclude="*" --include="*.txt" s3://mybucket/mysubbucket .
Upvotes: 41
Reputation: 785
I got the problem solved. It may be a little bit stupid, but it works.
Using Python, I wrote multiple lines of AWS download commands into one single .sh file, then executed it in the terminal.
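For reference, a minimal sketch of the same idea in plain shell (the answer used Python; bucket-name, filename.txt, and ./downloads/ are placeholders):
# Generate one download command per name in filename.txt, then run the script.
while IFS= read -r name; do
    echo "aws s3 cp \"s3://bucket-name/$name\" ./downloads/"
done < filename.txt > download.sh
chmod +x download.sh
./download.sh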
Upvotes: -4