How to perform prediction using predict_generator on unlabeled test data in Keras?

Question

I'm trying to build an image classification model. It's a 4 class image classification. Here is my code for building image generators and running the training:

train_datagen = ImageDataGenerator(rescale=1./255.,
                               rotation_range=30,
                               horizontal_flip=True,
                               validation_split=0.1)


train_generator = image_gen.flow_from_directory(train_dir, target_size=(299, 299), 
                                                class_mode='categorical', batch_size=20,
                                                subset='training')

validation_generator = image_gen.flow_from_directory(train_dir, target_size=(299, 299), 
                                                class_mode='categorical', batch_size=20,
                                                subset='validation')

model.compile(Adam(learning_rate=0.001), loss='categorical_crossentropy',
                                         metrics=['accuracy'])

model.fit_generator(train_generator, steps_per_epoch=int(440/20), epochs=20, 
                              validation_data=validation_generator, 
                              validation_steps=int(42/20))

I was able to get train and validation work perfectly because the images in train directory are stored in a separate folder for each class. But, as you can see below, the test directory has 100 images and no folders inside it. It also doesn't have any labels and only contains image files.

How can I do prediction on the image files in test folder using Keras?

today · Accepted Answer

If you are interested to only perform prediction, you can load the images by a simple hack like this:

test_datagen = ImageDataGenerator(rescale=1/255.)

test_generator = test_datagen('PATH_TO_DATASET_DIR/Dataset',
                              # only read images from `test` directory
                              classes=['test'],
                              # don't generate labels
                              class_mode=None,
                              # don't shuffle
                              shuffle=False,
                              # use same size as in training
                              target_size=(299, 299))

preds = model.predict_generator(test_generator)

You can access test_generator.filenames to get a list of corresponding filenames so that you can map them to their corresponding prediction.

Update (as requested in comments section): if you want to map predicted classes to filenames, first you must find the predicted classes. If your model is a classification model, then probably it has a softmax layer as the classifier. So the values in preds would be probabilities. Use np.argmax method to find the index with highest probability:

preds_cls_idx = preds.argmax(axis=-1)

So this gives you the indices of predicted classes. Now we need to map indices to their string labels (i.e. "car", "bike", etc.) which are provided by training generator in class_indices attribute:

import numpy as np

idx_to_cls = {v: k for k, v in train_generator.class_indices.items()}
preds_cls = np.vectorize(idx_to_cls.get)(preds_cls_idx)
filenames_to_cls = list(zip(test_generator.filenames, preds_cls))

How to perform prediction using predict_generator on unlabeled test data in Keras?

Answers (2)

Related Questions