Aritro Pal Choudhury

Reputation: 113

pytorch torchvision.datasets.ImageFolder FileNotFoundError: Found no valid file for the classes .ipynb_checkpoints

I tried to load training data with PyTorch's torchvision.datasets.ImageFolder in Colab.

transform = transforms.Compose([transforms.Resize(400),
                                transforms.ToTensor()])
dataset_path = 'ss/'
dataset = datasets.ImageFolder(root=dataset_path, transform=transform)
dataloader = torch.utils.data.DataLoader(dataset, batch_size=20)

I encountered the following error:

---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
<ipython-input-27-7abcc1f434b1> in <module>()
      2                                 transforms.ToTensor()])
      3 dataset_path = 'ss/'
----> 4 dataset = datasets.ImageFolder(root=dataset_path, transform=transform)
      5 dataloader = torch.utils.data.DataLoader(dataset, batch_size=20)

3 frames
/usr/local/lib/python3.7/dist-packages/torchvision/datasets/folder.py in make_dataset(directory, class_to_idx, extensions, is_valid_file)
    100         if extensions is not None:
    101             msg += f"Supported extensions are: {', '.join(extensions)}"
--> 102         raise FileNotFoundError(msg)
    103 
    104     return instances

FileNotFoundError: Found no valid file for the classes .ipynb_checkpoints. Supported extensions are: .jpg, .jpeg, .png, .ppm, .bmp, .pgm, .tif, .tiff, .webp

My dataset folder contains a subfolder with many training images in PNG format, but ImageFolder still can't access them.

Upvotes: 11

Views: 28010

Answers (6)

Boxin Zhang

Reputation: 11

I ran into this problem in Colab too. I tried all the solutions suggested here, but none of them worked. It turned out that the hidden ".ipynb_checkpoints" folder was the cause. In my case, the only way to get rid of it was to create the dataset folders in code rather than manually, because I could not delete ".ipynb_checkpoints" from folders I had created manually. Here is the code I used to create the folders. You can change the folder names as needed.

from pathlib import Path

data_path = Path("image/")

image_path_train = data_path / "train"
image_path_train_A = image_path_train / "A"
image_path_train_B = image_path_train / "B"

image_path_test = data_path / "test"
image_path_test_A = image_path_test / "A"
image_path_test_B = image_path_test / "B"


image_path_train.mkdir(parents=True, exist_ok=True)
image_path_test.mkdir(parents=True, exist_ok=True)

image_path_train_A.mkdir(parents=True, exist_ok=True)
image_path_train_B.mkdir(parents=True, exist_ok=True)

image_path_test_A.mkdir(parents=True, exist_ok=True)
image_path_test_B.mkdir(parents=True, exist_ok=True)

Upvotes: 1

PreciseGradient9751

Reputation: 9

I got the same error recently. It turned out to be a directory structure issue. I was using this with ImageLoader, so make sure your structure looks like this (note that I was also using Google Colab):

data

  • data_you_are_trying_to_train
    • train
      • class_1
    • test
      • class_1

Upvotes: 0

A solution for Google Colaboratory:
When you create a directory, Colaboratory additionally creates .ipynb_checkpoints in it.
To solve the problem, it is enough to remove it from the folder that contains the image directories (i.e. from the train folder). Run:

!rm -R test/train/.ipynb_checkpoints
!ls -a test/train/   # to make sure that the deletion has occurred

where test/train/ is the path to my dataset folders.
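
The same cleanup can also be done from Python, which catches checkpoint folders nested deeper in the tree as well. This is only a minimal sketch and not part of the original answer: remove_checkpoint_dirs is a hypothetical helper name, and the path you pass in is your own dataset root.

```python
import shutil
from pathlib import Path


def remove_checkpoint_dirs(root):
    """Delete every .ipynb_checkpoints directory under root; return the removed paths."""
    # Collect matches first so the tree is not modified while it is being walked.
    targets = [p for p in Path(root).rglob(".ipynb_checkpoints") if p.is_dir()]
    removed = []
    for path in targets:
        if path.exists():  # may already be gone if an enclosing match was deleted
            shutil.rmtree(path)
            removed.append(str(path))
    return removed
```

Calling remove_checkpoint_dirs("test/train") before constructing ImageFolder should leave only the real class folders in place. Colab may recreate the folder later, so it can be worth rerunning the call right before loading the data.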

Upvotes: 2

Mohamed Djebbar

Reputation: 1

1- The files in the image folder need to be placed in a subfolder for each class (as Sergii Dymchenko said).

2- Use an absolute path when working in Google Colab.

Upvotes: 0

Y. Zhang

Reputation: 216

I encountered the same problem when I was using IPython notebook-like tools.

First, please check whether there are any hidden files under your dataset_path. Use ls -a if you are in a Linux environment.

In my case, I found a hidden folder called .ipynb_checkpoints sitting alongside the image class subfolders. I think it confuses the PyTorch dataset, which treats it as a class with no images. I made sure it was not needed, so I simply deleted it, and then the dataset worked fine.

Or if you would like to simply ignore that folder, you may also try this.
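
One way to ignore such entries is to skip dot-prefixed directories when building the class list. The following is a minimal sketch under stated assumptions, not the approach from the link above: find_visible_classes is a hypothetical helper, and it returns a (classes, class_to_idx) pair because that is the shape recent torchvision versions use for ImageFolder's find_classes method.

```python
import os


def find_visible_classes(directory):
    """Return (classes, class_to_idx) for subfolders whose names do not start with a dot."""
    classes = sorted(
        entry.name
        for entry in os.scandir(directory)
        if entry.is_dir() and not entry.name.startswith(".")
    )
    if not classes:
        raise FileNotFoundError(f"Couldn't find any class folder in {directory}.")
    # Class indices follow the sorted folder names.
    class_to_idx = {name: index for index, name in enumerate(classes)}
    return classes, class_to_idx
```

If your installed torchvision lets you override find_classes(self, directory) on an ImageFolder subclass (an assumption worth checking against your version), that override could simply return find_visible_classes(directory), so that .ipynb_checkpoints never becomes a class.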

Upvotes: 20

Sergii Dymchenko

Reputation: 7229

The files in the image folder need to be placed in the subfolders for each class, like this:

root/dog/xxx.png
root/dog/xxy.png
root/dog/[...]/xxz.png

root/cat/123.png
root/cat/nsdf3.png
root/cat/[...]/asd932_.png

https://pytorch.org/vision/stable/datasets.html#torchvision.datasets.ImageFolder

Are the files in your ss directory organized this way?
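
A quick way to check the layout is to count the image files under each class folder. This is a minimal sketch rather than torchvision code: count_images_per_class is a hypothetical helper, and the extension list is copied from the error message in the question.

```python
import os

# Extensions listed in the FileNotFoundError message from the question.
IMG_EXTENSIONS = (".jpg", ".jpeg", ".png", ".ppm", ".bmp", ".pgm", ".tif", ".tiff", ".webp")


def count_images_per_class(root):
    """Map each immediate subfolder of root to the number of image files found beneath it."""
    counts = {}
    for entry in sorted(os.scandir(root), key=lambda e: e.name):
        if not entry.is_dir():
            continue  # files placed directly in root belong to no class
        total = 0
        for _dirpath, _dirnames, filenames in os.walk(entry.path):
            total += sum(name.lower().endswith(IMG_EXTENSIONS) for name in filenames)
        counts[entry.name] = total
    return counts
```

Any folder that maps to 0 in the result of count_images_per_class("ss/"), such as a hidden .ipynb_checkpoints folder, is a likely cause of the error in the question.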

Upvotes: 1
