Reputation: 6739
I was just given a large list of 500 MB files in Google Drive. How would I go about queuing them for download on my local Linux machine?
I can't zip them all as a large download
I can't set them all to download at once
I can't be here all day to download them in small batches.
I see from documentation like "wget/curl large file from google drive" that the Google Drive API is deprecated and that we can't wget the files.
So what I am looking for is a way to sequentially download large files from Google Drive, without having to manually click through the web browser.
Upvotes: 3
Views: 5293
Reputation: 18963
tl;dr If you want to download a big file, e.g. one with id 0B4y35FiV1wh7X2pESGlLREpxdXM:
# check the filename
$ curl 'https://drive.usercontent.google.com/download?id=0B4y35FiV1wh7X2pESGlLREpxdXM&export=download&confirm=t' \
| htmlq -t '.uc-warning-subcaption'
mecab-jumandic-7.0-20130310.tar.gz (29M) is too large for Google to scan for viruses. Would you still like to download this file?
# download the file
$ curl 'https://drive.usercontent.google.com/download?id=0B4y35FiV1wh7X2pESGlLREpxdXM&export=download&confirm=t' \
-o mecab-jumandic-7.0-20130310.tar.gz
Let's say you want to download a file with id 0B4y35FiV1wh7MWVlSDBCSXZMTXM. It's small enough, so this will work:
$ curl 'https://drive.google.com/uc?export=download&id=0B4y35FiV1wh7MWVlSDBCSXZMTXM' \
-LOJ
-L makes it follow redirects (there's one redirect here), -O makes it write the response to a file and use the basename of the path from the URL as the filename (uc in this case), and -J makes -O use the filename from the Content-Disposition header returned by the server (if such a header is returned).
But with 0B4y35FiV1wh7X2pESGlLREpxdXM that won't work, because Google doesn't scan big files for viruses and wants you to confirm that you really want to download the file.
Let's do it step by step:
$ curl 'https://drive.google.com/uc?export=download&id=0B4y35FiV1wh7X2pESGlLREpxdXM' \
-sSw '%header{location}' -o /dev/null
https://drive.usercontent.google.com/download?id=0B4y35FiV1wh7X2pESGlLREpxdXM&export=download
-s: be silent (don't show the progress meter or error messages), -S: make -s hide only the progress meter (errors are still shown), -w '%header{location}': output the Location header, -o /dev/null: discard the response body.
$ curl 'https://drive.usercontent.google.com/download?id=0B4y35FiV1wh7X2pESGlLREpxdXM&export=download' \
-sSOJw '%header{content-disposition}'
The server didn't return the Content-Disposition header, so the response was saved in a file named download:
$ cat download | htmlq -t '.uc-warning-subcaption'
mecab-jumandic-7.0-20130310.tar.gz (29M) is too large for Google to scan for viruses. Would you still like to download this file?
htmlq printed the text (-t) of the tag with class uc-warning-subcaption.
And there's a form there:
$ cat download | htmlq -p form
<form action="https://drive.usercontent.google.com/download" id="download-form" method="get">
<input class="goog-inline-block jfk-button jfk-button-action" id="uc-download-link" type="submit" value="Download anyway">
<input name="id" type="hidden" value="0B4y35FiV1wh7X2pESGlLREpxdXM">
<input name="export" type="hidden" value="download">
<input name="confirm" type="hidden" value="t">
<input name="uuid" type="hidden" value="158cc63f-3a8a-47b3-827f-c74bcaa04d16">
</form>
htmlq printed the form tag and prettified (-p) the output.
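If htmlq isn't available, the hidden fields can be pulled out with grep and sed. This is only a rough sketch: form_query is a made-up helper name, and it assumes each hidden input carries name="..." before value="..." inside a single tag, as in the output above. Values are not URL-encoded.

```shell
# Build a query string from the hidden <input> fields of a saved confirmation page.
# Assumption: name="..." appears before value="..." within each <input> tag.
form_query() {
  grep -o '<input[^>]*type="hidden"[^>]*>' "$1" \
    | sed -E 's/.*name="([^"]*)".*value="([^"]*)".*/\1=\2/' \
    | paste -sd '&' -
}
```

Running `form_query download` on the page saved earlier should print something like id=...&export=download&confirm=t&uuid=..., which can be appended to https://drive.usercontent.google.com/download? to form the confirmed URL.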
So it's the last URL plus the confirm and uuid fields. The uuid changes with each request, but it appears not to be necessary: it suffices to add only confirm=t:
$ curl 'https://drive.usercontent.google.com/download?id=0B4y35FiV1wh7X2pESGlLREpxdXM&export=download&confirm=t' \
-o mecab-jumandic-7.0-20130310.tar.gz
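To queue many files, the same URL can be built in a loop. The following is a sketch, not a tested batch tool: drive_url and download_all are hypothetical helper names, the list file holds one file id per line, and each download is saved under its id (rename afterwards) because the real filename isn't known up front.

```shell
# Build the direct-download URL for a Drive file id, using the confirm=t shortcut.
drive_url() {
  printf 'https://drive.usercontent.google.com/download?id=%s&export=download&confirm=t\n' "$1"
}

# Download every id listed in a file (one per line), one after another.
download_all() {
  while IFS= read -r id; do
    [ -n "$id" ] || continue                 # skip blank lines
    # -f: fail on HTTP errors, -L: follow redirects, -o: save under the id
    curl -fL -o "$id" "$(drive_url "$id")" || echo "failed: $id" >&2
  done < "$1"
}
```

Usage would be `download_all ids.txt`, where ids.txt is an example file name. Failed ids are reported on stderr and the loop continues with the next one.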
Upvotes: 0
Reputation: 10135
Google Drive Sync helped me with this. You can either install the sync client and download the files, or sync all files to your local PC.
Or
Create a new drive, share the required files to it, then sync that drive to your local PC using Google Drive Sync.
Upvotes: 0
Reputation: 609
curl -H "Authorization: Bearer XXXXX" 'https://www.googleapis.com/drive/v3/files/YYYYY?alt=media' -o ZZZZZ
where
XXXXX===> Access Token
YYYYY===> File ID
ZZZZZ===> File Name that will be saved
Tested on Linux (should also work on Mac). Windows users can try PowerShell.
Reference: Drive REST API Documentation.
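For a whole list of files, the command can be wrapped in a loop. A minimal sketch, assuming a valid OAuth access token is already exported as ACCESS_TOKEN (a variable name chosen here for illustration) and that the list file has one file ID per line. api_url and fetch_all are hypothetical helper names, and each file is saved under its ID.

```shell
# Build the Drive v3 media URL for a file ID (same endpoint as the command above).
api_url() {
  printf 'https://www.googleapis.com/drive/v3/files/%s?alt=media\n' "$1"
}

# Fetch each ID from a list file sequentially.
# Assumption: ACCESS_TOKEN holds a token that is valid for the whole run.
fetch_all() {
  while IFS= read -r id; do
    [ -n "$id" ] || continue                 # skip blank lines
    curl -f -H "Authorization: Bearer $ACCESS_TOKEN" "$(api_url "$id")" -o "$id" \
      || echo "failed: $id" >&2
  done < "$1"
}
```

Usage would be `fetch_all ids.txt`. Access tokens expire, so a long queue may need a fresh token partway through.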
Upvotes: 12
Reputation: 6739
I am going to explain how I solved my issue:
Use https://github.com/prasmussen/gdrive and set it up as per the instructions on the page. Contrary to my original misconception, it is not deprecated at this time.
Right click on the files and choose "Get shareable link". (gdrive list doesn't work in this case, as the files are not in your drive but in someone else's. You could look into copying them over to your own drive and getting the list from there.)
Paste all the links into a plain-text file, and then remove https://drive.google.com/open?id= from the start of all the links.
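The prefix removal can be done with sed instead of by hand. A small sketch: strip_prefix is a made-up helper name, and it only handles links of the open?id= form mentioned above.

```shell
# Strip the share-link prefix, leaving one file id per line.
# In sed's basic regex syntax the ? is a literal character; the dots are escaped.
strip_prefix() {
  sed 's|^https://drive\.google\.com/open?id=||' "$1"
}
```

For example, `strip_prefix links.txt > files.txt`, where links.txt is an example name for the file holding the pasted links.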
Then you can sequentially download the list of Google Drive files with the following command:
while read -r p; do ./gdrive download "$p"; done < files.txt
Upvotes: 1