Luca Guarro

Reputation: 1168

Pickle reports FileNotFoundError for file that definitely exists in Google Colab

I am attempting to pickle.load three files that definitely exist within my Google Drive, yet I am not able to load them because my code cannot find them.

Here is the error message:

FileNotFoundError: [Errno 2] Failed to open local file '/root/.cache/huggingface/datasets/good_reads_practice_dataset/main_domain/1.1.0/5f8cad709a7746be18b722642fc8ade5c1cedfa3b259440a401f7f4701079561/cache-af18ded1f5c5aaac.arrow'. Detail: [errno 2] No such file or directory

This is my code that is not working:

import pickle

with open(r"/content/drive/MyDrive/Thesis/Datasets/book_preprocessing/PreTokenized/ALBERT_NER_512/train_dataset.pkl", "rb") as input_file:
  train_dataset = pickle.load(input_file)

with open(r"/content/drive/MyDrive/Thesis/Datasets/book_preprocessing/PreTokenized/ALBERT_NER_512/val_dataset.pkl", "rb") as input_file:
  val_dataset = pickle.load(input_file)

with open(r"/content/drive/MyDrive/Thesis/Datasets/book_preprocessing/PreTokenized/ALBERT_NER_512/test_dataset.pkl", "rb") as input_file:
  test_dataset = pickle.load(input_file)

And here is a screenshot of my Google Drive directory tree: [screenshot]

I have checked the names of the files many times, so unless I am going crazy, the files definitely exist at the paths I have pointed to. Moreover, I am sure that my code is correct: I have loaded pickled objects from all the other folders in the 'PreTokenized' folder several times without issue, which makes this bug even more mysterious. I also have no clue why it is looking for the pickled objects in the cache of the Google Colab environment. If anyone has an idea of what is happening and how I can solve it, I would greatly appreciate any help.

Upvotes: 2

Views: 243

Answers (1)

Luca Guarro

Reputation: 1168

I still don't know exactly why this is happening, but I did realize that my files were not actually being uploaded because they were too big. (I did not get an error or warning about this, but the pickled files were only a few KB, so I knew something was up.)
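The size check that gave the problem away can be scripted. This is only a minimal sketch: the helper name `report_sizes` and the `min_bytes` threshold are made up for illustration, and a sensible threshold depends on how large the datasets are expected to be.

```python
import os


def report_sizes(paths, min_bytes=1_000_000):
    """Return {path: size in bytes, or None if missing} and print a warning
    for every file that is missing or smaller than min_bytes.

    min_bytes is an arbitrary illustrative threshold, not a known limit.
    """
    sizes = {}
    for path in paths:
        if not os.path.exists(path):
            sizes[path] = None
            print(f"missing: {path}")
            continue
        size = os.path.getsize(path)
        sizes[path] = size
        if size < min_bytes:
            print(f"suspiciously small ({size} bytes): {path}")
    return sizes
```

Calling it with the three `.pkl` paths from the question before `pickle.load` would flag files that are only a few KB.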

To get around this, I split the files into halves and then, upon downloading them, concatenated them back together.
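Roughly, the split-and-rejoin step could look like the sketch below. It assumes the split happens at the byte level on the pickled file itself, which may not be exactly what I did at the time; the function names and the `.partN` suffix are hypothetical.

```python
import os
import shutil


def split_file(path, n_parts=2):
    """Split the file at `path` into n_parts byte chunks.

    Chunks are written next to the original as path.part0, path.part1, ...
    Each chunk is read into memory in one go, which is fine for a sketch
    but may need block-wise copying for very large files.
    """
    size = os.path.getsize(path)
    chunk = -(-size // n_parts)  # ceiling division
    part_paths = []
    with open(path, "rb") as src:
        for i in range(n_parts):
            part_path = f"{path}.part{i}"
            with open(part_path, "wb") as dst:
                dst.write(src.read(chunk))
            part_paths.append(part_path)
    return part_paths


def join_files(part_paths, out_path):
    """Concatenate the parts, in the order given, into out_path."""
    with open(out_path, "wb") as dst:
        for part_path in part_paths:
            with open(part_path, "rb") as src:
                shutil.copyfileobj(src, dst)
    return out_path
```

The parts are uploaded separately, and after downloading, `join_files` rebuilds a file that can be opened with `pickle.load` as before. The order of `part_paths` matters.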

Upvotes: 0
