Reputation: 637
The Keras function image_dataset_from_directory, in the preprocessing module, takes a path as an argument and automatically infers the classes when the images are stored in separate subfolders. In my case, however, I have a single folder, and the image classes are specified in a DataFrame.
.
├── datasets
│ ├── sample_submit.csv
│ ├── test_images
│ │ ├── test_0000.jpg
│ │ ├── test_0001.jpg
│ │ ├── test_0002.jpg
│ │ └── ...
│ ├── test_images.csv
│ ├── train_images
│ │ ├── train_0000.jpg
│ │ ├── train_0001.jpg
│ │ ├── train_0002.jpg
│ │ └── ...
│ └── train_images.csv
└── model.py
TensorFlow's documentation specifies that when you are not inferring the labels, a list or tuple must be passed, which I get from the DataFrame df. However, when I specify the image folder, TensorFlow raises a ValueError because it has found no images:
In [1]: df = pd.read_csv('datasets/train_images.csv')
...: tds = keras.preprocessing\
...: .image_dataset_from_directory('datasets/train_images', list(df['class']),
...: validation_split=0.2, subset='training',
...: seed=123, image_size=(180, 180))
ValueError: Expected the lengths of `labels` to match the number of files in the target directory. len(labels) is 1102 while we found 0 files in datasets/train_images.
Why does keras not recognise the images within the folder? I have tried setting the "full" relative path with ./datasets/train_images, adding a slash with datasets/train_images/ and also the absolute path, to no avail. What is missing here? Alternatively, is there a more efficient approach in this case where I can still get the train/test split?
EDIT: It seems there is a limitation with keras and this question originally laid it out, but remained too vague to get to the heart of the matter.
Plain and clear: keras seems to always scrape the subfolders of the directory argument for images and build the dataset. The workaround to enable the loading of images is to wrap an additional folder (outer_train) and pass it to directory.
However, I still have problems with this approach, because now keras seems unable to take the custom classes passed as a list and outputs "Found 1102 files belonging to 1 classes." (the single class being the name of the new subfolder, train_images), so any help is still appreciated.
Upvotes: 2
Views: 1886
Reputation: 166
keras seems to always scrape the subfolders of the directory argument for images and build the dataset. The workaround to enable the loading of images is to wrap an additional folder (outer_train) and pass it to directory.
The problem is that image_dataset_from_directory expects a directory that contains other directories, and it collects the images from inside those subdirectories rather than from the directory you passed in.
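To check what a given folder looks like from that point of view, a small diagnostic can help. The following is only a rough sketch that mimics the behaviour described above, not Keras's actual implementation; the function name count_images_per_subfolder is made up for illustration, and the extension list is an assumption based on the formats the function documents (jpeg, png, bmp, gif).

```python
from pathlib import Path

# Assumed set of extensions, based on the documented formats (jpeg, png, bmp, gif).
IMAGE_EXTENSIONS = {".bmp", ".gif", ".jpeg", ".jpg", ".png"}


def count_images_per_subfolder(directory):
    """Return {subfolder name: number of image files found beneath it}.

    Files that sit directly in `directory` are deliberately ignored,
    to mirror the subfolder-only scanning described above.
    """
    counts = {}
    for child in sorted(Path(directory).iterdir()):
        if child.is_dir():
            counts[child.name] = sum(
                1
                for path in child.rglob("*")
                if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS
            )
    return counts


# Hypothetical usage with the folders from the question:
# count_images_per_subfolder("datasets/train_images")  # flat folder, no subfolders
# count_images_per_subfolder("outer_train")            # wrapper folder
```

For a flat folder such as datasets/train_images this returns an empty dict, which is consistent with the "found 0 files" error. For the outer_train wrapper it returns a single entry, which is consistent with the "1 classes" message.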
However, I still have problems with this approach, because now keras seems unable to take the custom classes passed as a list and outputs "Found 1102 files belonging to 1 classes."
I don't think you can read images like that. If you want the method to read the images of a custom class then you have to place the folder with the custom class of images inside the folder you want to read like this:
directory_to_read/
    class__1/
        img1
        img2
    class__2/
        img1
        img2
    custom_class/
        img1
        img2
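One way to get from a single flat folder plus a CSV to a layout like this is to copy each image into a folder named after its class. The sketch below uses only the standard library. The helper name sort_images_into_class_folders is hypothetical, and so is the filename column name "id", since the question only shows the 'class' column. Adjust both column names to match the real CSV.

```python
import csv
import shutil
from pathlib import Path


def sort_images_into_class_folders(csv_path, image_dir, output_dir,
                                   filename_column="id", class_column="class"):
    """Copy every image listed in the CSV into output_dir/<class>/.

    `filename_column` is an assumed column name; the question only
    shows that a 'class' column exists.
    Returns the number of files copied.
    """
    image_dir = Path(image_dir)
    output_dir = Path(output_dir)
    copied = 0
    with open(csv_path, newline="") as csv_file:
        for row in csv.DictReader(csv_file):
            source = image_dir / row[filename_column]
            class_dir = output_dir / str(row[class_column])
            class_dir.mkdir(parents=True, exist_ok=True)
            # Copy rather than move, so the original folder stays intact.
            shutil.copy2(source, class_dir / source.name)
            copied += 1
    return copied


# Hypothetical usage with the paths from the question:
# sort_images_into_class_folders("datasets/train_images.csv",
#                                "datasets/train_images",
#                                "datasets/train_by_class")
```

The resulting output folder (here datasets/train_by_class, a made-up name) can then be passed as the directory argument so that the classes are inferred from the subfolder names.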
Upvotes: 2