Reputation: 7056
I have a few datasets in folders, and I am concatenating them using ConcatDataset. My data folders look like this (note that folders 1 and 2 have only 1 class rather than 2):
- denotes subfolders
folder0
-cats
-dogs
folder1
-cats
folder2
-cats
folder3
-dogs
and then I do this:
trainset1 = datasets.ImageFolder(folder0, loader=my_loader, transform=SomeAug())
trainset2 = datasets.ImageFolder(folder1, loader=my_loader, transform=SomeAug())
trainset3 = datasets.ImageFolder(folder2, loader=my_loader, transform=SomeAug())
trainset = torch.utils.data.ConcatDataset([trainset1, trainset2, trainset3])
Is this the legit way of doing this? When I look at the total images via:
len(train_loader.dataset)
it adds up correctly.
However, when I do:
print(trainset.classes)
it throws:
AttributeError: 'ConcatDataset' object has no attribute 'classes'
which it does not when I use just one dataset.
I just wanted to ensure that there are no gotchas in using this ConcatDataset approach.
Upvotes: 2
Views: 2650
Reputation: 3958
ImageFolder inherits from DatasetFolder, which has a method find_classes that is called in the constructor to initialize the attribute DatasetFolder.classes. Thus, when trainset is a single ImageFolder, you can access trainset.classes without error.
However, ConcatDataset does not inherit from ImageFolder and, more generally, does not implement a classes attribute. It would be difficult to do so in general, because ImageFolder finds its classes by relying on a specific file structure, whereas ConcatDataset assumes no such structure so that it can work with a more general set of datasets.
If this functionality is essential to you, you could write a simple dataset type that inherits from ConcatDataset, expects ImageFolder datasets specifically, and stores the classes as the union of the classes from each constituent dataset.
Upvotes: 2