Reputation: 327
I'm trying to download the GhTorrent dump from http://ghtorrent-downloads.ewi.tudelft.nl/mysql/mysql-2020-07-17.tar.gz, which is about 127 GB.
I tried streaming it from a cloud machine, but the transfer stops after about 6 GB. I suspect there is a size limit when piping curl like this:
curl http://ghtorrent... | gsutil cp - gs://MY_BUCKET_NAME/mysql-2020-07-17.tar.gz
I cannot use Data Transfer because it asks for the URL, the size in bytes (which I have), and the MD5 hash, which I don't have and which, I think, I can only generate with the file on my disk.
Is there any other way to download the file and upload it directly to the cloud? My total disk size is 117 GB (sad beep).
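On the MD5 point: the hash can in principle be computed while streaming, so the file never has to sit on disk. A minimal sketch, assuming the base64-encoded MD5 is what the transfer tool wants; `iter_url_chunks` and `md5_and_size` are hypothetical helper names, and the whole file is still downloaded once just to hash it.

```python
import base64
import hashlib
import urllib.request


def iter_url_chunks(url, chunk_size=8 * 1024 * 1024):
    """Yield the response body in fixed-size chunks (nothing is written to disk)."""
    with urllib.request.urlopen(url) as response:
        while True:
            chunk = response.read(chunk_size)
            if not chunk:
                break
            yield chunk


def md5_and_size(chunks):
    """Return (base64-encoded MD5, total bytes) for an iterable of byte chunks."""
    digest = hashlib.md5()
    total = 0
    for chunk in chunks:
        digest.update(chunk)
        total += len(chunk)
    return base64.b64encode(digest.digest()).decode("ascii"), total


# Usage (not run here, it would pull the full file over the network):
# md5_b64, size = md5_and_size(iter_url_chunks("http://..."))
```

Memory use stays at one chunk at a time, so the 117 GB disk is not a constraint for this step.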
Upvotes: 1
Views: 1165
Reputation: 6350
Worked for me with Storage Transfer Service: https://console.cloud.google.com/transfer/
Have a look at the pricing before moving TBs, especially if your target is Nearline/Coldline: https://cloud.google.com/storage-transfer/pricing
Simple example that copies a file from a public url to my bucket using a Transfer Job.
Contents of the .tsv file:
TsvHttpData-1.0
http://public-url-pointint-to-the-file
I stored the .tsv file on my bucket: https://storage.googleapis.com/<my-bucket-name>/theTsv.tsv
When creating the transfer job, put the url that points to the theTsv.tsv file in the "URL of TSV file" field.
My file, named MD5SUB, was copied from the source url into my bucket, under an identical directory structure.
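For reference, the URL list can also be generated by a small script. This is a sketch under my understanding of the format: a `TsvHttpData-1.0` header, then one tab-separated line per object with the URL followed by an optional size in bytes and an optional base64 MD5, with lines sorted by URL. `build_url_list` is a hypothetical helper, not part of any Google library.

```python
def build_url_list(entries):
    """Build TSV URL-list text from (url, size_or_None, md5_base64_or_None) tuples."""
    lines = ["TsvHttpData-1.0"]
    # Sort by URL only; Python's code-point ordering matches UTF-8 byte ordering.
    for url, size, md5_b64 in sorted(entries, key=lambda entry: entry[0]):
        fields = [url]
        if size is not None:
            fields.append(str(size))
            # Assumption: the MD5 column is only written when the size column is present.
            if md5_b64 is not None:
                fields.append(md5_b64)
        lines.append("\t".join(fields))
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder URL taken from the example above; write the result to theTsv.tsv.
    print(build_url_list([("http://public-url-pointint-to-the-file", None, None)]), end="")
```

With only a URL per line, the output matches the two-line file shown in the example.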
Upvotes: 3