charelf

Reputation: 3825

Load image dataset (folder or zip) located in Google Drive into Google Colab?

I have a dataset of images on my Google Drive. I have this dataset both in a compressed .zip version and an uncompressed folder.

I want to train a CNN using Google Colab. How can I tell Colab where the images in my Google Drive are?

  1. The official tutorial does not help me, as it only shows how to upload single files, not a folder with 10,000 images as in my case.

  2. Then I found this answer, but the solution is not complete, or at least I did not understand how to continue after unzipping. Unfortunately, I am unable to comment on that answer, as I don't have enough Stack Overflow reputation.

  3. I also found this thread, but all of its answers use other tools, such as GitHub or Dropbox.

I hope someone can explain what I need to do or tell me where to find help.

Edit1:

I have found yet another thread asking the same question as mine. Sadly, of its three answers, two refer to Kaggle, which I don't know and don't use. The third answer provides two links: the first refers to the third thread I linked above, and the second only explains how to upload single files manually.

Upvotes: 15

Views: 25052

Answers (4)

Neha

Reputation: 21

I saw and tried all of the above, but it didn't work for me. So here is a simple solution, with a simple explanation, that can help you load a .zip image folder and extract the images from it.

  • Connect to Google Drive
    from google.colab import drive
    drive.mount('/content/drive')
    

(you will get a link; sign in to your Google account, copy the authorization code, and paste it into the field Colab shows)

  • Install and import the Keras library
    !pip install -q keras
    import keras
    
    

(the zip file from your Drive is now accessible in Colab)

  • Unzip the file
    !unzip 'zip-file-path'
    

To get the path:

  • open the Files pane on the left side of Google Colab
  • browse to the file and click the three dots next to it
  • select "Copy path"

The unzipped image folder is now available in your Colab session; use it as you wish, for example by feeding it to a Keras generator, as sketched below.
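A minimal sketch, assuming the archive was extracted to a hypothetical /content/dataset folder with one subfolder per class (adjust the path, image size, and batch size to your data):

from keras.preprocessing.image import ImageDataGenerator

# '/content/dataset' is a placeholder; flow_from_directory expects one subfolder per class.
datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)

train_gen = datagen.flow_from_directory(
    '/content/dataset',
    target_size=(128, 128),
    batch_size=32,
    subset='training')

val_gen = datagen.flow_from_directory(
    '/content/dataset',
    target_size=(128, 128),
    batch_size=32,
    subset='validation')

You can then pass train_gen and val_gen to model.fit (or fit_generator on older Keras versions).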

Upvotes: 2

RomRoc

Reputation: 1535

The other answers are excellent, but they require you to authenticate with Google Drive every time, which is not very convenient if you want to run your notebook top to bottom.

I had the same need: I wanted to download a single zip file containing the dataset from Drive to Colab. I preferred to get a shareable link for that file and run the following cell (substitute drive_url with your shared link):

import urllib.request

drive_url = 'https://drive.google.com/uc?export=download&id=1fBVMX66SlvrYa0oIau1lxt1_Vy-XYZWG'
file_name = 'downloaded.zip'

urllib.request.urlretrieve(drive_url, file_name)
print('Download completed!')
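If you then want the images unpacked, a small follow-up sketch using the standard library (the 'dataset' target folder is just an example name):

import zipfile

# Extract the downloaded archive into a local folder of your choice.
with zipfile.ZipFile(file_name, 'r') as archive:
    archive.extractall('dataset')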

Upvotes: 5

raul quijada ferrero

Reputation: 161

To update the answer: you can now do this directly from Google Colab.

# Load the Drive helper and mount
from google.colab import drive

# This will prompt for authorization.
drive.mount('/content/drive')

!ls "/content/drive/My Drive"

Google Documentation
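Once Drive is mounted, the zip (or the uncompressed folder) is reachable under /content/drive/My Drive/. A sketch, assuming a hypothetical dataset.zip in the Drive root; extracting to Colab's local disk is usually much faster than reading thousands of small files through the mount:

import zipfile

# Placeholder path; adjust it to wherever your archive lives in Drive.
zip_path = '/content/drive/My Drive/dataset.zip'

with zipfile.ZipFile(zip_path, 'r') as archive:
    archive.extractall('/content/dataset')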

Upvotes: 11

VeilEclipse

Reputation: 2856

As mentioned by @yl_low here

Step 1:

!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse

Step 2:

from google.colab import auth
auth.authenticate_user()

Step 3:

from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}

Both Step 2 and Step 3 will require you to fill in the verification codes provided by the URLs.

Step 4:

!mkdir -p drive
!google-drive-ocamlfuse drive

Step 5:

print('Files in Drive:')
!ls drive/
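
After Step 5 the Drive contents are available under drive/. As a quick sanity check that the images are visible (the folder name and file extension below are hypothetical):

import glob

# Count the images in the mounted dataset folder.
image_paths = glob.glob('drive/my_images/**/*.jpg', recursive=True)
print('Found {} images'.format(len(image_paths)))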

Upvotes: 7
