Bryce Drew

Reputation: 6739

Download Large Google-Drive Files

I was just given a large list of 500mb files in google drive. How would I go about queuing them for download on my local Linux machine?

I can't zip them all as a large download

I can't set them all to download at once

I can't be here all day to download them in small batches.

I see from posts like wget/curl large file from google drive that the API is deprecated and that we can't simply wget them.

So what I am looking for is a way to sequentially download large files from a Google Drive, without having to manually click through the web browser to do so.

Upvotes: 3

Views: 5293

Answers (4)

x-yuri

Reputation: 18963

tl;dr If you want to download a big file e.g. with id 0B4y35FiV1wh7X2pESGlLREpxdXM:

# check out the filename
$ curl 'https://drive.usercontent.google.com/download?id=0B4y35FiV1wh7X2pESGlLREpxdXM&export=download&confirm=t' \
    | htmlq -t '.uc-warning-subcaption'
mecab-jumandic-7.0-20130310.tar.gz (29M) is too large for Google to scan for viruses. Would you still like to download this file?

# download the file
$ curl 'https://drive.usercontent.google.com/download?id=0B4y35FiV1wh7X2pESGlLREpxdXM&export=download&confirm=t' \
    -o mecab-jumandic-7.0-20130310.tar.gz

Let's say you want to download a file with id 0B4y35FiV1wh7MWVlSDBCSXZMTXM. It's small enough and this will work:

$ curl 'https://drive.google.com/uc?export=download&id=0B4y35FiV1wh7MWVlSDBCSXZMTXM' \
    -LOJ

-L makes it follow redirects (there's one redirect here), -O makes it output the response to a file, using the basename of the URL path as the filename (uc in this case), -J makes -O use the filename from the Content-Disposition header returned by the server (if such a header is returned).

But with 0B4y35FiV1wh7X2pESGlLREpxdXM that won't work, because Google doesn't scan big files for viruses and wants you to confirm that you really want to download the file.

Let's do it step by step:

$ curl 'https://drive.google.com/uc?export=download&id=0B4y35FiV1wh7X2pESGlLREpxdXM' \
    -sSw '%header{location}' -o /dev/null
https://drive.usercontent.google.com/download?id=0B4y35FiV1wh7X2pESGlLREpxdXM&export=download

-s - be silent (don't show the progress meter or error messages), -S - make -s hide only the progress meter, -w '%header{location}' - output the Location header, -o /dev/null - discard the response body.

$ curl 'https://drive.usercontent.google.com/download?id=0B4y35FiV1wh7X2pESGlLREpxdXM&export=download' \
    -sSOJw '%header{content-disposition}'

The server didn't return the Content-Disposition header, so the response was saved to a file named download:

$ cat download | htmlq -t '.uc-warning-subcaption'
mecab-jumandic-7.0-20130310.tar.gz (29M) is too large for Google to scan for viruses. Would you still like to download this file?

htmlq outputted the text (-t) of the tag with class uc-warning-subcaption.

And there's a form there:

$ cat download | htmlq -p form
<form action="https://drive.usercontent.google.com/download" id="download-form" method="get">
    <input class="goog-inline-block jfk-button jfk-button-action" id="uc-download-link" type="submit" value="Download anyway">
    <input name="id" type="hidden" value="0B4y35FiV1wh7X2pESGlLREpxdXM">
    <input name="export" type="hidden" value="download">
    <input name="confirm" type="hidden" value="t">
    <input name="uuid" type="hidden" value="158cc63f-3a8a-47b3-827f-c74bcaa04d16">
</form>

htmlq outputted the form tag and prettified (-p) the output.

So it's the last url plus the confirm and uuid fields. The uuid changes with each request, but it appears it isn't necessary; it suffices to add confirm=t:

$ curl 'https://drive.usercontent.google.com/download?id=0B4y35FiV1wh7X2pESGlLREpxdXM&export=download&confirm=t' \
    -o mecab-jumandic-7.0-20130310.tar.gz
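Since the question is about a whole list of files, the one-liner above can be wrapped in a loop. A minimal sketch; the function name and the ids.txt file (one file id per line) are made up for illustration:

```shell
# fetch one file by id; confirm=t bypasses the virus-scan page and
# -OJ names the output from the Content-Disposition header, if the
# server sends one (-f makes curl fail on HTTP errors)
fetch_drive_file() {
    curl -fLOJ "https://drive.usercontent.google.com/download?id=$1&export=download&confirm=t"
}

# queue the whole list sequentially:
# while read -r id; do fetch_drive_file "$id"; done < ids.txt
```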

Upvotes: 0

Arun Prasad E S

Reputation: 10135

Google Drive Sync helped me with this. You can install the sync client and either download or sync all the files to your local PC.

Or

Create a new drive, share the required files to this drive, then sync this drive to your local PC using Google Drive Sync.

Upvotes: 0

Joseph Varghese

Reputation: 609

  1. Go to OAuth 2.0 Playground
  2. In Select & authorize APIs (Step 1), select "Drive API v3" and then select "https://www.googleapis.com/auth/drive.readonly" as the scope.
  3. Authorize the API using your Google ID and exchange the authorization code for tokens (Step 2). You will get an access token and a refresh token.
  4. Open terminal and run

curl -H "Authorization: Bearer XXXXX" "https://www.googleapis.com/drive/v3/files/YYYYY?alt=media" -o ZZZZZ

where

XXXXX => Access Token

YYYYY => File ID

ZZZZZ => File name that the download will be saved as

5. Tested on Linux (should also work on Mac). Windows users can try PowerShell.
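The per-file command in step 4 extends naturally to a batch. A sketch, assuming ids.txt holds one file id per line and TOKEN holds the access token from step 3 (both names are made up); note that OAuth access tokens expire, so a very long queue may need a refreshed token partway through:

```shell
# download every id listed in the given file via the Drive v3 API;
# each file is saved under its id (rename afterwards if needed)
download_all() {
    while read -r id; do
        curl -H "Authorization: Bearer $TOKEN" \
            "https://www.googleapis.com/drive/v3/files/${id}?alt=media" -o "$id"
    done < "$1"
}

# usage: TOKEN=XXXXX download_all ids.txt
```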

Reference: Drive REST API Documentation.

Upvotes: 12

Bryce Drew

Reputation: 6739

I am going to explain how I solved my issue:

Use https://github.com/prasmussen/gdrive and set it up as per the instructions on the page. It is not deprecated at this time, contrary to my original misconception.

Right-click on the files and get a shareable link. (gdrive list doesn't work in this case, as the files are not in your drive but someone else's. You could look into copying them over to your own drive and getting the list from there.)

Paste all the links into a plain-text file, and then remove https://drive.google.com/open?id= from the start of all the links.
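Stripping the prefix can be scripted as well. A small sketch; links.txt and files.txt are made-up names, and the sample entry is just for illustration:

```shell
# links.txt: one shareable link per line (sample entry)
printf '%s\n' 'https://drive.google.com/open?id=0B4y35FiV1wh7X2pESGlLREpxdXM' > links.txt

# delete the fixed prefix, leaving only the file ids
sed 's#^https://drive\.google\.com/open?id=##' links.txt > files.txt
```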

Then you can sequentially download the list of Google Drive files with the following command: while read -r p; do ./gdrive download "$p"; done < files.txt

Upvotes: 1
