Stat Tistician

Reputation: 883

Download file from Kaggle to Google Colab

I want to download the sign language dataset from Kaggle to my Colab.

So far I always used wget and the specific zip file link, for example:

!wget --no-check-certificate \
    https://storage.googleapis.com/laurencemoroney-blog.appspot.com/rps.zip \
    -O /tmp/rps.zip

However, when I right-click the download button on Kaggle and choose "Copy link", the URL I get is:

https://www.kaggle.com/datamunge/sign-language-mnist/download

When I open this link in my browser, I am prompted to download a file named 3258_5337_bundle_archive.zip.

So I tried:

!wget --no-check-certificate \
        https://www.kaggle.com/datamunge/sign-language-mnist/download3258_5337_bundle_archive.zip  \
        -O /tmp/kds.zip

and also tried:

 !wget --no-check-certificate \
            https://www.kaggle.com/datamunge/sign-language-mnist/download3258_5337_bundle_archive.zip  \
            -O /tmp/kds.zip

I get as output:

exa

So it does not work. Either the file cannot be found, or the returned zip archive is only a few KB instead of 101 MB. Unzipping it also fails.

How can I download this file into my Colab, ideally directly with wget?

Upvotes: 6

Views: 3134

Answers (2)

Ruston

Reputation: 156

This is the simplest way I have found to do it (if you are taking part in a competition, change datasets to competitions and -d to -c):

import os

os.environ['KAGGLE_USERNAME'] = "xxxx"

os.environ['KAGGLE_KEY'] = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx"

!kaggle datasets download -d iarunava/happy-house-dataset
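To avoid pasting the key into the notebook, the same two environment variables can be filled from an uploaded kaggle.json token file. This is a minimal sketch, assuming the token file has the usual "username" and "key" fields; load_kaggle_credentials is a hypothetical helper name, not part of the Kaggle package.

```python
import json
import os


def load_kaggle_credentials(path="kaggle.json"):
    """Read a Kaggle API token file and export it as environment variables.

    Assumes the file is JSON with "username" and "key" fields.
    Returns the username so the caller can confirm which account is active.
    """
    with open(path) as f:
        token = json.load(f)
    os.environ["KAGGLE_USERNAME"] = token["username"]
    os.environ["KAGGLE_KEY"] = token["key"]
    return token["username"]
```

Call load_kaggle_credentials("kaggle.json") before running the !kaggle command, so that the command inherits the variables from the notebook process.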

Upvotes: 1

rchurt

Reputation: 1533

Kaggle recommends using their own API instead of wget or rsync.
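The few-KB file that wget saved in the question is probably not the archive at all. A download link that requires a login often returns a small HTML page instead. A quick way to check what was saved is sketched below; describe_download is a hypothetical helper name, and the HTML check is only a rough heuristic.

```python
import os
import zipfile


def describe_download(path):
    """Return a short description of what was actually saved at `path`."""
    size = os.path.getsize(path)
    # is_zipfile checks for the zip signature rather than the file extension.
    if zipfile.is_zipfile(path):
        with zipfile.ZipFile(path) as archive:
            count = len(archive.namelist())
        return f"zip archive, {size} bytes, {count} entries"
    # Heuristic: look at the first bytes for an HTML document start.
    with open(path, "rb") as f:
        head = f.read(200).lstrip().lower()
    if head.startswith(b"<!doctype html") or head.startswith(b"<html"):
        return f"HTML page, {size} bytes (probably a login or error page)"
    return f"not a zip archive, {size} bytes"
```

For example, print(describe_download("/tmp/kds.zip")) after the wget call shows whether the file is a real archive before you try to unzip it.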

First, create a Kaggle API token. On Kaggle's website, go to "My Account", scroll to the API section, and click "Create New API Token". This downloads a kaggle.json file to your machine.

Then run the following in Google Colab:

from google.colab import files
files.upload() # Browse for the kaggle.json file that you downloaded

# Create the ~/.kaggle directory (no error if it already exists), copy kaggle.json into it, and restrict the file's permissions.
! mkdir -p ~/.kaggle
! cp kaggle.json ~/.kaggle/
! chmod 600 ~/.kaggle/kaggle.json

# You can check if everything's okay by running this command.
! kaggle datasets list

# Download and unzip sign-language-mnist dataset into '/usr/local'
! kaggle datasets download -d datamunge/sign-language-mnist --path '/usr/local' --unzip

Used info from here: https://www.kaggle.com/general/74235
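The mkdir, cp and chmod steps above can also be done in plain Python, which is handy if you want to rerun the cell without errors. This is a minimal sketch, not the method from the linked thread; install_kaggle_token and its home parameter are hypothetical names introduced for illustration.

```python
import os
import shutil
from pathlib import Path


def install_kaggle_token(src="kaggle.json", home=None):
    """Copy a Kaggle API token to <home>/.kaggle/kaggle.json with mode 600.

    `home` defaults to the current user's home directory; it is a parameter
    only so the function can be tried against a scratch directory.
    """
    home = Path(home) if home is not None else Path.home()
    target_dir = home / ".kaggle"
    # exist_ok=True makes the call safe to repeat, like `mkdir -p`.
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / "kaggle.json"
    shutil.copyfile(src, target)
    # Owner read/write only, matching `chmod 600`.
    os.chmod(target, 0o600)
    return target
```

After files.upload(), calling install_kaggle_token() places the token where the ! kaggle commands above look for it.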

Upvotes: 11
