Reputation: 11
I'm experimenting on Colab with image classification on 32x32-pixel images; I have 248 pics for training and 62 for testing (far too few, I know, but it's just for experimenting). There are only two classes, and I get the data as follows:
train_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True)
training_set = train_datagen.flow_from_directory(
    'training_set', target_size=(32,32),
    class_mode='binary')

test_datagen = tf.keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255)
test_set = test_datagen.flow_from_directory(
    'test_set', target_size=(32,32),
    class_mode='binary')
My current CNN architecture is this:
cnn = tf.keras.models.Sequential([
    tf.keras.layers.Conv2D(64, 3, activation='relu', input_shape=[32,32,3]),
    tf.keras.layers.AveragePooling2D(2),
    tf.keras.layers.Conv2D(64, 3, activation='relu'),
    tf.keras.layers.AveragePooling2D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(64, activation='relu'),
    tf.keras.layers.Dropout(0.5),
    tf.keras.layers.Dense(1, activation='sigmoid'),
])
And for compiling:
cnn.compile(optimizer='adam', loss='binary_crossentropy',
            metrics=['accuracy'])
Training:
hist = cnn.fit(x=training_set, validation_data=test_set, epochs=30)
After 30 epochs, the model gives:
Epoch 30/30 8/8 [==============================] - 1s 168ms/step - loss: 0.4237 - accuracy: 0.8347 - val_loss: 0.5812 - val_accuracy: 0.7419
I evaluated on the test data:
cnn.evaluate(test_set)
which gave me:
2/2 [==============================] - 0s 80ms/step - loss: 0.5812 - accuracy: 0.7419
[0.5812247395515442, 0.7419354915618896]
This would be nice for such a small dataset, but checking the results with a classification report from sklearn gives a much lower value (which is the correct one) of only 0.48 accuracy. To get this value, I did:
predictions = cnn.predict(test_set)
I thresholded the probability values in predictions at 0.5 to turn them into the predicted labels (0 or 1) and compared these with the correct labels of the test data in the classification_report.
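In sketch form, the thresholding step was:

import numpy as np

# map the sigmoid probabilities (shape (62, 1)) to hard 0/1 labels at a 0.5 cut-off
predicted_labels = (predictions > 0.5).astype(int).ravel()

and the report was produced with: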
from sklearn.metrics import confusion_matrix, classification_report
print(classification_report(test_labels, predicted_labels))
The report showed:
              precision    recall  f1-score   support

           0       0.48      0.52      0.50        31
           1       0.48      0.45      0.47        31

    accuracy                           0.48        62
   macro avg       0.48      0.48      0.48        62
weighted avg       0.48      0.48      0.48        62
So why can't the model.evaluate(...) function calculate the correct accuracy? Or, put differently: what exactly does this evaluate function calculate, and what is the meaning of this number 0.7419?
Does anybody have an idea about this problem?
Upvotes: 1
Views: 756
Reputation: 29
You can define a new test generator, but this time set shuffle to False.
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from sklearn.metrics import classification_report, accuracy_score

new_test_datagen = ImageDataGenerator(rescale=1./255)
new_test_generator = new_test_datagen.flow_from_directory(
    test_dir,
    target_size=(32,32),   # match the input size the model expects
    shuffle=False,         # keep file order so labels and predictions line up
    batch_size=32,
    seed=None)

# Display the accuracy score for a softmax classifier
softmax_y_true = new_test_generator.classes        # true labels, in file order
softmax_y_pred = model.predict(new_test_generator)
softmax_y_pred = np.argmax(softmax_y_pred, axis=1) # most probable class per sample
print("Accuracy: {0}".format(accuracy_score(softmax_y_true, softmax_y_pred)))
Upvotes: 0
Reputation: 11
I've found the well-hidden reason for this problem. It lies in the order of collecting the list of all test_labels (the ground truth) versus running the predictions on the test data with model.predict(test_set).
I found that the method predict(test_set) shuffles the content of test_set!
So I saved the labels of the test_set BEFORE calling predict(test_set), and now I have a perfect match between the accuracy in my classification_report and the accuracy from the method evaluate(test_set) / val_accuracy.
I also ran predict on each single object in test_set and calculated the accuracy myself; this accuracy also matched the val_accuracy from the last epoch.
By the way: the method evaluate(test_set) also shuffles the content of test_set! So one has to be very careful when extracting data from test_set "manually" (see the sketch below).
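A minimal sketch of a robust pattern (flow_from_directory defaults to shuffle=True, so an explicitly unshuffled generator keeps labels and predictions in the same file order; the names eval_datagen and eval_set are just for illustration):

# an unshuffled generator yields batches in file order on every pass
eval_datagen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255)
eval_set = eval_datagen.flow_from_directory(
    'test_set', target_size=(32,32),
    class_mode='binary', shuffle=False)

test_labels = eval_set.classes            # ground truth, in file order
predictions = cnn.predict(eval_set)       # predictions in the same order
predicted_labels = (predictions > 0.5).astype(int).ravel()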
Upvotes: 0