waspinator

Reputation: 6826

Get true labels from a Keras generator

I want to use sklearn.metrics.confusion_matrix(y_true, y_pred) to create a confusion matrix for a Keras model.

After training a model, I can use predict_generator(generator) to get predictions for a test dataset, which gives me y_pred. How can I get the corresponding true labels, y_true, from the data generator?

Upvotes: 4

Views: 7188

Answers (2)

waspinator

Reputation: 6826

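After creating a data generator, either your own or one built with ImageDataGenerator (for example via flow_from_directory() with shuffle=False, so the label order matches the prediction order), read the true labels from its classes attribute and use your trained model to make predictions: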

true_labels = data_generator.classes
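# predict_generator() is deprecated in newer TensorFlow releases; model.predict(data_generator) is the replacement there
predictions = model.predict_generator(data_generator)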

sklearn's confusion_matrix expects 1-D arrays of class labels, so convert the per-class scores in predictions to class indices with np.argmax():

import numpy as np
y_true = true_labels
y_pred = np.argmax(predictions, axis=1)  # index of the highest-scoring class per sample

Then you can pass those variables directly to the confusion_matrix function:

import sklearn.metrics
cm = sklearn.metrics.confusion_matrix(y_true, y_pred)
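A minimal, self-contained sketch of the steps above, using a small made-up array in place of real model output. The values of predictions and y_true here are assumptions for illustration only, not results from any model:

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Hypothetical softmax output for 5 samples and 3 classes
predictions = np.array([
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.1, 0.2, 0.7],
    [0.6, 0.3, 0.1],
    [0.3, 0.3, 0.4],
])
# Hypothetical true labels, in the same order as the rows above
y_true = np.array([0, 1, 2, 1, 2])

# Pick the highest-scoring class for each sample
y_pred = np.argmax(predictions, axis=1)
cm = confusion_matrix(y_true, y_pred)
print(cm)
```

Rows of cm correspond to the true classes and columns to the predicted classes.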

And you can plot it using the example plot_confusion_matrix() function found here:

https://scikit-learn.org/stable/auto_examples/model_selection/plot_confusion_matrix.html


Upvotes: 1

Karl

Reputation: 5822

generator.classes gives you the observed values in sparse format, i.e., as integer class indices. If you need them in dense (i.e., one-hot encoded) format, you can get that with:

import pandas as pd
# get_dummies() already returns a dense DataFrame; .to_dense() was removed in pandas 1.0
pd.get_dummies(pd.Series(generator.classes))

NOTE though: you must set the generator's shuffle attribute to False before generating the predictions and fetching the observed classes, otherwise your predictions and observations will not line up!
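If the generator is your own and has no classes attribute, one option is to collect labels and predictions batch by batch in the same pass, which also keeps the two aligned. A rough sketch, assuming a generator that yields (inputs, one-hot labels) batches; toy_batches and predict_fn are hypothetical stand-ins for a real generator and for model.predict:

```python
import numpy as np

def toy_batches():
    """Hypothetical stand-in for a data generator: yields (x, one-hot y) batches."""
    x = np.arange(12, dtype=float).reshape(6, 2)
    y = np.eye(3)[[0, 1, 2, 2, 1, 0]]  # one-hot labels for 3 classes
    for start in range(0, len(x), 4):
        yield x[start:start + 4], y[start:start + 4]

def predict_fn(batch_x):
    """Hypothetical stand-in for model.predict: returns per-class scores."""
    # Deterministic fake scores derived from the first feature
    idx = (batch_x[:, 0] // 2).astype(int) % 3
    return np.eye(3)[idx]

y_true_parts, y_pred_parts = [], []
for batch_x, batch_y in toy_batches():
    # Labels and predictions come from the same batch, so they stay aligned
    y_true_parts.append(np.argmax(batch_y, axis=1))
    y_pred_parts.append(np.argmax(predict_fn(batch_x), axis=1))

y_true = np.concatenate(y_true_parts)
y_pred = np.concatenate(y_pred_parts)
```

Keras iterators typically loop indefinitely, so with a real one you would stop after len(generator) batches rather than exhausting the loop.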

Upvotes: 4
