Vinson Ciawandy

Reputation: 1166

Filter classes/subfolders with PyTorch ImageFolder

Here's my folder structure:

image-folders/
   ├── class_0/
   |   ├── 001.jpg
   |   └── 002.jpg
   ├── class_1/
   |   ├── 001.jpg
   |   └── 002.jpg
   └── class_2/
       ├── 001.jpg
       └── 002.jpg

Using ImageFolder from torchvision, I can create a dataset like this:
dataset = ImageFolder("image-folders",...)

But this reads every subfolder and creates 3 target classes. I don't want to include the class_2 folder; I want my dataset to contain only class_0 and class_1. Is there any way to achieve this other than deleting or moving the class_2 folder?

Upvotes: 7

Views: 5595

Answers (1)

Shai

Reputation: 114926

You can do this by wrapping the full ImageFolder dataset in a torch.utils.data.Subset that contains only the samples you want to keep:

from torchvision.datasets import ImageFolder
from torch.utils.data import Subset

# construct the full dataset
dataset = ImageFolder("image-folders",...)
# select the indices of all samples that are not in class_2
idx = [i for i in range(len(dataset)) if dataset.imgs[i][1] != dataset.class_to_idx['class_2']]
# build the appropriate subset
subset = Subset(dataset, idx)
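
To make the index selection easier to follow, here is a small torch-free sketch of the same filtering logic. The `imgs` list and `class_to_idx` dict are hand-written stand-ins for `dataset.imgs` and `dataset.class_to_idx`, with values assumed from the folder layout in the question, and `indices_for` is a hypothetical helper name, not part of torchvision. It selects by the classes to keep rather than the one to drop, which may read more naturally when several folders are excluded.

```python
# Stand-ins for dataset.imgs and dataset.class_to_idx (assumed values that
# mirror the folder layout in the question; ImageFolder sorts class names).
class_to_idx = {"class_0": 0, "class_1": 1, "class_2": 2}
imgs = [
    ("image-folders/class_0/001.jpg", 0),
    ("image-folders/class_0/002.jpg", 0),
    ("image-folders/class_1/001.jpg", 1),
    ("image-folders/class_1/002.jpg", 1),
    ("image-folders/class_2/001.jpg", 2),
    ("image-folders/class_2/002.jpg", 2),
]


def indices_for(imgs, class_to_idx, keep):
    """Return the positions of samples whose class name is in `keep`.

    Hypothetical helper for illustration only.
    """
    keep_idx = {class_to_idx[name] for name in keep}
    return [i for i, (_, label) in enumerate(imgs) if label in keep_idx]


idx = indices_for(imgs, class_to_idx, keep=["class_0", "class_1"])
print(idx)  # [0, 1, 2, 3]
```

With a real dataset, the resulting list can be passed straight to Subset(dataset, idx). Note that Subset only restricts which samples are returned: the labels keep their original values and dataset.classes still lists all three folders. Here that is harmless because class_2 has the highest index, but dropping a class in the middle would leave non-contiguous labels that may need remapping.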

Upvotes: 10
