Cannot CSV Load a file in Colab Using tf.compat.v1.keras.utils.get_file

Question

I have mounted my GDrive and have a CSV file in a folder. I am following the tutorial. However, when I call tf.keras.utils.get_file(), I get a ValueError, as follows.

data_folder = r"/content/drive/My Drive/NLP/project2/data"
import os
print(os.listdir(data_folder))

It returns:

['crowdsourced_labelled_dataset.csv',
 'P2_Testing_Dataset.csv',
 'P2_Training_Dataset_old.csv',
 'P2_Training_Dataset.csv']

TRAIN_DATA_URL = os.path.join(data_folder, 'P2_Training_Dataset.csv')
train_file_path = tf.compat.v1.keras.utils.get_file("train.csv", TRAIN_DATA_URL)

But this returns:

Downloading data from /content/drive/My Drive/NLP/project2/data/P2_Training_Dataset.csv
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
 in ()
      2 TRAIN_DATA_URL = os.path.join(data_folder, 'P2_Training_Dataset.csv')
      3 TEST_DATA_URL = os.path.join(data_folder, 'P2_Testing_Dataset.csv')
----> 4 train_file_path = tf.compat.v1.keras.utils.get_file("train.csv", TRAIN_DATA_URL)
      5 test_file_path = tf.compat.v1.keras.utils.get_file("eval.csv", TEST_DATA_URL)


6 frames
/usr/lib/python3.6/urllib/request.py in _parse(self)
    382         self.type, rest = splittype(self._full_url)
    383         if self.type is None:
--> 384             raise ValueError("unknown url type: %r" % self.full_url)
    385         self.host, self.selector = splithost(rest)
    386         if self.host:

ValueError: unknown url type: '/content/drive/My Drive/NLP/project2/data/P2_Training_Dataset.csv'

What am I doing wrong please?

akilat90 · Accepted Answer

As per the docs, this is the signature of tf.compat.v1.keras.utils.get_file and what a call to it does:

tf.keras.utils.get_file(
    fname,
    origin,
    untar=False,
    md5_hash=None,
    file_hash=None,
    cache_subdir='datasets',
    hash_algorithm='auto',
    extract=False,
    archive_format='auto',
    cache_dir=None
)

By default the file at the url origin is downloaded to the cache_dir ~/.keras, placed in the cache_subdir datasets, and given the filename fname. The final location of a file example.txt would therefore be ~/.keras/datasets/example.txt.

Returns: Path to the downloaded file

Since you already have the data in your drive, there's no need to download it again. IIUC, the function expects origin to be an accessible URL, which is why a plain file path fails with "unknown url type". There's also no need to obtain the file path from a function call, because you already know it.
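To illustrate where the error comes from, here is a minimal sketch using only the standard library. It assumes, based on the traceback in the question, that the download goes through urllib.request. It only builds a request object and does not touch the network or the drive.

```python
import urllib.request
from urllib.parse import urlsplit

# Same path as in the question; nothing is read from disk here.
local_path = "/content/drive/My Drive/NLP/project2/data/P2_Training_Dataset.csv"

# A bare filesystem path has no URL scheme (http, https, file, ...).
scheme = urlsplit(local_path).scheme
print(repr(scheme))

# urllib refuses to build a request for a string without a scheme.
message = None
try:
    urllib.request.Request(local_path)
except ValueError as err:
    message = str(err)
    print(message)
```

The message printed at the end starts with "unknown url type", which matches the last line of the traceback above.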

Assuming the drive is mounted, you can set your file paths directly:

train_file_path = os.path.join(data_folder, 'P2_Training_Dataset.csv')
test_file_path = os.path.join(data_folder, 'P2_Testing_Dataset.csv')
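As a rough sketch of how the resulting path can be used, the snippet below checks that the file exists and reads it with the standard csv module. The helper local_csv_path is hypothetical (it is not part of TensorFlow), and the temporary folder with made-up rows is only a stand-in for the mounted Drive folder so that the example runs anywhere. In Colab you would pass data_folder from the question instead.

```python
import csv
import os
import tempfile


def local_csv_path(data_folder, filename):
    """Hypothetical helper: build the path and fail early if the file is missing."""
    path = os.path.join(data_folder, filename)
    if not os.path.isfile(path):
        raise FileNotFoundError(f"{path} not found; is the drive mounted?")
    return path


# Stand-in for the mounted Drive folder (assumption for this sketch only).
demo_folder = tempfile.mkdtemp()
with open(os.path.join(demo_folder, "P2_Training_Dataset.csv"), "w", newline="") as f:
    csv.writer(f).writerows([["text", "label"], ["hello", "1"]])  # made-up rows

# No download step: the local path is used as-is.
train_file_path = local_csv_path(demo_folder, "P2_Training_Dataset.csv")

with open(train_file_path, newline="") as f:
    rows = list(csv.reader(f))

print(rows[0])  # header row of the stand-in file
```

Failing early with FileNotFoundError makes a missing mount or a misspelled file name obvious before any training code runs.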
