Reputation: 35
I am using the FER2013 dataset, and when I call datagen.flow_from_directory, it doesn't find all the images in the directory.
Here's my code:
import tensorflow as tf

IMAGE_SIZE = 224
BATCH_SIZE = 64

train_data_dir = "/content/drive/My Drive/Colab/FER2013/Training"
validation_data_dir = "/content/drive/My Drive/Colab/FER2013/PublicTest"

# One generator config shared by both directories; note the validation_split
datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    validation_split=0.2)

train_generator = datagen.flow_from_directory(
    train_data_dir,
    target_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=BATCH_SIZE,
    subset='training')

val_generator = datagen.flow_from_directory(
    validation_data_dir,
    target_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=BATCH_SIZE,
    subset='validation')
Here's the result.
Found 22921 images belonging to 7 classes.
Found 714 images belonging to 7 classes.
I don't get an error per se, but the Training directory has 28000+ images and PublicTest has 3000+. Why does it find only 22921 and 714 instead of the actual number of images?
Upvotes: 1
Views: 3223
Reputation: 8112
It looks like you have one directory for training images and a separate one for validation images. Each should contain 7 subdirectories, one per class, named identically in both directories. In the data generator you set validation_split=0.2, which splits whatever directory a generator points at: 80% of its images go to the 'training' subset and 20% to the 'validation' subset. So train_generator sees roughly 28000 x 0.8 = 22400 of your training images, which is close to the 22921 reported. Likewise val_generator, with subset='validation', sees only about 20% of the 3000+ images in PublicTest, which is where the 714 comes from. Since you already have a separate validation directory, remove validation_split (or set it to 0) so that all the images in each directory are used. With validation_split=0 you do not need subset in the flow_from_directory calls, so drop it from both. Then pass train_generator to model.fit and val_generator as its validation_data.
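The fix described above might look like the sketch below. The function names build_generators and split_counts are hypothetical, introduced only for illustration. split_counts assumes the split is applied per class folder with the validation share truncated to a whole number, which is my reading of how Keras behaves and would explain why the reported counts are not exactly 80% and 20% of the totals.

```python
def split_counts(class_counts, validation_split):
    """Estimate (training, validation) file counts for one directory.

    Assumption: the split is applied to each class folder separately and
    the validation share is truncated to an integer.
    """
    val = sum(int(n * validation_split) for n in class_counts.values())
    train = sum(class_counts.values()) - val
    return train, val


def build_generators(train_dir, val_dir, image_size=224, batch_size=64):
    """Build generators that use every image in each directory."""
    # Imported here so split_counts stays usable without TensorFlow installed.
    import tensorflow as tf

    # No validation_split: the two directories already separate the data.
    datagen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1.0 / 255)

    # No subset argument either, so each call reads its whole directory.
    train_generator = datagen.flow_from_directory(
        train_dir,
        target_size=(image_size, image_size),
        batch_size=batch_size)
    val_generator = datagen.flow_from_directory(
        val_dir,
        target_size=(image_size, image_size),
        batch_size=batch_size)
    return train_generator, val_generator
```

Calling build_generators(train_data_dir, validation_data_dir) with the paths from the question should then report the full image count for each directory. The two generators can be passed to a compiled model (not shown here) as model.fit(train_generator, validation_data=val_generator).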
Upvotes: 1