Reputation: 173
I have been using TF 2.0 recently. I trained a simple CNN model (with the Keras Sequential API) for binary classification of images, loading the images from disk with tf.data.Dataset. The model reached good accuracy, with train binary_accuracy: 0.9831 and validation binary_accuracy: 0.9494.
I then evaluated the model using model.evaluate(), which gave a binary accuracy of 0.9460. But when I calculate binary accuracy manually using predict_classes(), I get around 0.384. I don't know what the issue is. Please help me out.
Below is the code I used for compiling and training the model, followed by the code for evaluating it.
train_data = tf.data.Dataset.from_tensor_slices((tf.constant(train_x),tf.constant(train_y)))
val_data = tf.data.Dataset.from_tensor_slices((tf.constant(val_x),tf.constant(val_y)))
train_data = train_data.map(preproc).shuffle(buffer_size=100).batch(BATCH_SIZE)
val_data = val_data.map(preproc).shuffle(buffer_size=100).batch(BATCH_SIZE)
model.compile(optimizer=Adam(learning_rate=0.0001),
              loss='binary_crossentropy',
              metrics=[tf.keras.metrics.BinaryAccuracy()])
checkpointer = ModelCheckpoint(filepath='weights.hdf5', verbose=1, save_best_only=True)
time1 = time.time()
history = model.fit(train_data.repeat(),
                    epochs=EPOCHS,
                    steps_per_epoch=STEPS_PER_EPOCH,
                    validation_data=val_data.repeat(),
                    validation_steps=VAL_STEPS,
                    callbacks=[checkpointer])
29/29 [==============================] - 116s 4s/step - loss: 0.0634 - binary_accuracy: 0.9826 - val_loss: 0.1559 - val_binary_accuracy: 0.9494
Now testing with unseen data
test_data = tf.data.Dataset.from_tensor_slices((tf.constant(unseen_faces),tf.constant(unseen_labels)))
test_data = test_data.map(preproc).batch(BATCH_SIZE)
model.evaluate(test_data)
9/9 [==============================] - 19s 2s/step - loss: 0.1689 - binary_accuracy: 0.9460
But when I calculate accuracy for the same model with model.predict_classes on the same dataset, the predictions are far from the evaluation report. The binary accuracy comes out around 38%.
Edit 1: Pre-processing function I used while training
def preproc(file_path,label):
    img = tf.io.read_file(file_path)
    img = tf.image.decode_jpeg(img)
    img = (tf.cast(img, tf.float32)/127.5) - 1
    return tf.image.resize(img,(IMAGE_HEIGHT,IMAGE_WIDTH)),label
Manual prediction code
import glob
from sklearn.metrics import classification_report

# Testing preprocessing function (same steps as preproc, without the label)
def preproc_test(file_path):
    img = tf.io.read_file(file_path)
    img = tf.image.decode_jpeg(img)
    img = (tf.cast(img, tf.float32)/127.5) - 1
    return tf.image.resize(img,(IMAGE_HEIGHT,IMAGE_WIDTH))

unseen_faces = []
unseen_labels = []
for im_path in glob.glob('dataset/data/*'):
    unseen_faces.append(im_path)
    # Label 0 for paths containing 'real', 1 otherwise
    if 'real' in im_path:
        unseen_labels.append(0)
    else:
        unseen_labels.append(1)
unseen_faces = list(map(preproc_test,unseen_faces))
unseen_faces = tf.stack(unseen_faces)
predicted_labels = model.predict_classes(unseen_faces)
print(classification_report(unseen_labels,predicted_labels,labels=[0,1]))
              precision    recall  f1-score   support
           0       0.54      0.41      0.47        34
           1       0.41      0.54      0.47        26
    accuracy                           0.47        60
   macro avg       0.48      0.48      0.47        60
weighted avg       0.48      0.47      0.47        60
Upvotes: 2
Views: 8381
Reputation: 1674
In my case it was because the shapes of my ground truth and predicted results were different. I was loading data with (x_train, y_train), (x_test, y_test) = cifar10.load_data(), where y_train is a 2D ndarray of shape (50000,1), yet the prediction from model.predict_classes is of shape (50000,). If I compare them directly with np.mean(pred==y_train), NumPy broadcasts the two arrays into a (50000,50000) comparison and I get a result of 0.1, which is not correct. Instead, np.mean(pred==np.squeeze(y_train)) gives the correct result.
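The shape mismatch described above can be reproduced without a model. This is a minimal sketch using made-up stand-in data (1000 samples, 10 balanced classes, and a "perfect" prediction); the array names and sizes are assumptions for illustration, not the actual CIFAR-10 arrays.

```python
import numpy as np

# Stand-in labels (hypothetical): 10 balanced classes stored as a column
# vector, mimicking the (N, 1) layout returned by cifar10.load_data()
y_true = (np.arange(1000) % 10).reshape(-1, 1)   # shape (1000, 1)

# Stand-in predictions: identical to the labels, but flat like the
# output of predict_classes
pred = y_true[:, 0].copy()                       # shape (1000,)

# (1000,) == (1000, 1) broadcasts to a (1000, 1000) boolean matrix
broadcast_cmp = (pred == y_true)
wrong_acc = np.mean(broadcast_cmp)

# Squeezing the labels makes the comparison element-wise
right_acc = np.mean(pred == np.squeeze(y_true))

print(broadcast_cmp.shape, wrong_acc, right_acc)
```

Even though every prediction here is correct, the broadcast comparison reports roughly chance-level accuracy, so it is worth printing both shapes before comparing.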
Upvotes: 1
Reputation: 1633
Your model is doing well during both training and testing. Evaluation accuracy is computed from the model's predictions, so you are probably making a logical mistake in how you use model.predict_classes(). Please check that you are predicting with the trained model weights and not with a randomly initialized model.
evaluate: model.evaluate() is for evaluating your trained model. It returns the loss and metric values (such as accuracy) for the data you pass in, not predictions for your input data.
predict: model.predict() generates output predictions for the input samples. Its output is the target value predicted from your input data.
P.S.: For a binary classification problem with balanced classes, accuracy below 50% is worse than a random guess.
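To check that the manual number and the evaluate() number are computed the same way, one option is to threshold the raw sigmoid outputs yourself and compare them with labels flattened to the same shape. The sketch below uses only NumPy; manual_binary_accuracy is a hypothetical helper, and the probs array stands in for what model.predict(...) would return for a single sigmoid output. It assumes the labels are in the same order as the inputs fed to the model.

```python
import numpy as np

def manual_binary_accuracy(probs, labels, threshold=0.5):
    """Threshold sigmoid outputs and compare with labels of matching shape."""
    # Flatten both arrays so (N, 1) predictions and (N,) labels line up
    probs = np.asarray(probs).reshape(-1)
    labels = np.asarray(labels).reshape(-1)
    predicted = (probs > threshold).astype(int)
    return float(np.mean(predicted == labels))

# Hypothetical values standing in for model.predict(...) output, shape (4, 1)
probs = np.array([[0.9], [0.2], [0.7], [0.4]])
labels = [1, 0, 0, 0]

print(manual_binary_accuracy(probs, labels))
```

If this manual figure still disagrees with model.evaluate() on the same data, the likely remaining suspects are the pairing of labels with inputs and the weights loaded into the model.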
Upvotes: 1