Kaggle API in Colab: `!kaggle datasets list` error

I have a problem: I don't understand this error, which I get when trying to list Kaggle datasets in Google Colab.

Notebook config: Python 3.x, no hardware accelerator.

#to upload my kaggle token
from google.colab import files
files.upload()

#setting up the token
!pip install --upgrade kaggle
!mkdir -p ~/.kaggle
!cp kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json

#and taking a look at datasets
!kaggle datasets list

Traceback (most recent call last):
      File "/usr/local/bin/kaggle", line 8, in <module>
        sys.exit(main())
      File "/usr/local/lib/python3.6/dist-packages/kaggle/cli.py", line 51, in main
        out = args.func(**command_args)
      File "/usr/local/lib/python3.6/dist-packages/kaggle/api/kaggle_api_extended.py", line 940, in dataset_list_cli
        max_size, min_size)
      File "/usr/local/lib/python3.6/dist-packages/kaggle/api/kaggle_api_extended.py", line 905, in dataset_list
        return [Dataset(d) for d in datasets_list_result]
      File "/usr/local/lib/python3.6/dist-packages/kaggle/api/kaggle_api_extended.py", line 905, in <listcomp>
        return [Dataset(d) for d in datasets_list_result]
      File "/usr/local/lib/python3.6/dist-packages/kaggle/models/kaggle_models_extended.py", line 67, in __init__
        self.size = File.get_size(self.totalBytes)
      File "/usr/local/lib/python3.6/dist-packages/kaggle/models/kaggle_models_extended.py", line 107, in get_size
        while size >= 1024 and suffix_index < 4:
    TypeError: '>=' not supported between instances of 'NoneType' and 'int'

Well, I would like to understand what happened and how to fix it. Thanks in advance.

jet.

Upvotes: 0

Views: 2572

Answers (2)

Andy White

Reputation: 81

I ran into the same problem.

  1. Generate the Kaggle JSON API token: click your profile widget/icon in the top right corner of Kaggle -> click "Account" -> scroll down to the "API" section -> click "Expire API Token" -> click "Create New API Token".
  2. In Google Colab, upload your kaggle.json file.
  3. Run the following code:

#first upload the kaggle api file "kaggle.json"
import os
#this path contains the json file
os.environ['KAGGLE_CONFIG_DIR'] = "/content"

#find the competition or dataset under "Data" on Kaggle, then download it like this:
!kaggle competitions download -c jane-street-market-prediction
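Putting those pieces together, a minimal single Colab cell might look like the sketch below (it assumes kaggle.json was uploaded to /content, and reuses the example competition slug from above):

#upload kaggle.json to /content first (e.g. via files.upload())
import os

#tell the kaggle CLI where to find the token
os.environ['KAGGLE_CONFIG_DIR'] = "/content"

#optional: restrict permissions to silence the "world-readable" warning
!chmod 600 /content/kaggle.json

#download the competition data (example slug from above)
!kaggle competitions download -c jane-street-market-prediction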

This worked for me after a lot of banging my head against the wall.

If you still get errors, you may need to link your Colab and Kaggle accounts. You can do this in the account settings section of Kaggle.

Upvotes: 0

Mobile Ben

Reputation: 7341

I am encountering this problem as well. I noticed that if I use this call:

kaggle datasets list --min-size 1

it works. Note that you will need at least version 1.5.6: I had 1.5.4 on a Colab instance, and that version didn't support that argument.
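For example, in a Colab cell (a sketch; it assumes pip can resolve a kaggle release of at least 1.5.6):

#make sure the installed CLI is new enough to know --min-size
!pip install --upgrade "kaggle>=1.5.6"

#skip datasets that report no data files
!kaggle datasets list --min-size 1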

The problem seems to be that the bigquery/crypto-litecoin dataset has no data files. As a consequence, totalBytes is None in the Dataset model.
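You can reproduce the failing comparison from the traceback with a two-line sketch, where None stands in for the missing totalBytes:

size = None    #what totalBytes appears to be for a dataset with no files
size >= 1024   #raises TypeError: '>=' not supported between instances of 'NoneType' and 'int'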

I've opened an issue on GitHub and will create a PR. If you want a temporary workaround, you can grab the patched file from my fork; your traceback tells you where to put it. Alternatively, just use --min-size 1 so the listing ignores datasets that have no data files.
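If you'd rather patch the helper locally instead of replacing the whole file, the idea is roughly the following (a hypothetical sketch, not the library's exact code; it just treats a missing totalBytes as 0 bytes so the size formatting doesn't crash):

def get_size(size, precision=0):
    #treat None (no data files) as 0 bytes instead of crashing
    size = size or 0
    suffixes = ['B', 'KB', 'MB', 'GB', 'TB']
    suffix_index = 0
    while size >= 1024 and suffix_index < 4:
        suffix_index += 1
        size = size / 1024.0
    return '%.*f%s' % (precision, size, suffixes[suffix_index])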

Upvotes: 5
