Reputation: 904
Is it possible to directly upload data files from a website to an AWS S3 bucket without having to first download the files to my PC? These are large files (60 GB) and will tie up my PC for a week downloading, prepping, and then uploading the tables to my S3 bucket.
I see in my S3 Management Console there are options to "Add files" and "Add folder" but I don't see a place where I can enter the original URL location of the data files.
The data files are located here (scroll to bottom of page); URLs for the data are:
https://acic2022.mathematica.org/data/track1a_20220404.zip
https://acic2022.mathematica.org/data/track1b_20220404.zip
https://acic2022.mathematica.org/data/track1c_20220404.zip
Thanks!
Upvotes: 0
Views: 2058
Reputation: 41
The script below worked well for small files. With large files you have to wait: in my experience the upload to the S3 bucket does not start until the download has finished, which can take a long while. During that time you will only see the newly created bucket.
The bucket is created in the default region, us-east-1. You need a programmatic user with admin permissions, since the script drives the AWS CLI as an automated process.
You only need to change the FILE variable, which must be the last component of your source URL.
FILE="track1a_20220404.zip"
BCKT="math-files-s3"
SRC="https://acic2022.mathematica.org/data/$FILE"
# Create the bucket first; only if that succeeds, stream the download into S3
aws s3api create-bucket --bucket "$BCKT" && \
wget -O- "$SRC" | aws s3 cp - "s3://$BCKT/$FILE"
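Since the question lists three files, the same pipe can be wrapped in a loop. What follows is only a sketch under a few assumptions: the bucket already exists, `wget` and the AWS CLI are installed and configured, and the helper names `key_for_url` and `stream_to_s3` as well as the `DRY_RUN` switch are made up for illustration. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
set -euo pipefail

BCKT="math-files-s3"                          # bucket name used in the script above
BASE="https://acic2022.mathematica.org/data"  # source location from the question
FILES="track1a_20220404.zip track1b_20220404.zip track1c_20220404.zip"
DRY_RUN="${DRY_RUN:-1}"                       # hypothetical switch: 1 = print only, 0 = transfer

# Object key = last path component of the source URL.
key_for_url() { basename "$1"; }

# Stream one URL into the bucket without writing it to local disk.
stream_to_s3() {
  local url="$1" key
  key="$(key_for_url "$url")"
  if [ "$DRY_RUN" = "1" ]; then
    echo "wget -qO- $url | aws s3 cp - s3://$BCKT/$key"
  else
    wget -qO- "$url" | aws s3 cp - "s3://$BCKT/$key"
  fi
}

for f in $FILES; do
  stream_to_s3 "$BASE/$f"
done
```

Run it with `DRY_RUN=0` to perform the transfers. If a single file is larger than 50 GB, the AWS CLI documentation for `aws s3 cp` describes an `--expected-size` option for streamed uploads; whether that applies here depends on the individual file sizes, which I have not checked.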
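On the region point: if you want the bucket somewhere other than us-east-1, `aws s3api create-bucket` needs a `--create-bucket-configuration LocationConstraint=...` argument, while us-east-1 takes none. The sketch below only prints the matching command; the function name `create_bucket_cmd` is hypothetical and `eu-west-1` is just an example region.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the create-bucket command for a bucket and an optional region.
# us-east-1 takes no LocationConstraint; other regions require one.
create_bucket_cmd() {
  local bucket="$1" region="${2:-us-east-1}"
  if [ "$region" = "us-east-1" ]; then
    echo "aws s3api create-bucket --bucket $bucket --region $region"
  else
    echo "aws s3api create-bucket --bucket $bucket --region $region --create-bucket-configuration LocationConstraint=$region"
  fi
}

create_bucket_cmd "math-files-s3"
create_bucket_cmd "math-files-s3" "eu-west-1"   # example region only
```

Before running the real command, `aws sts get-caller-identity` is a quick way to confirm which user the CLI is actually authenticated as.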
Upvotes: 1