Reputation: 35
When the same test dataset is fed into the trained model for evaluation, different accuracies are returned each time. What could be the reason, and is there a way to fix it?
The directory structure is shown below; each class has 200-300 images.
dataset
|_ class1
|_ class2
|_ class3
......
The code is shown below:
# import dataset
import pathlib
import tensorflow as tf

dataset_path = '<directory>'
DIR = pathlib.Path(dataset_path)
# validation set
validation_set = tf.keras.preprocessing.image_dataset_from_directory(
DIR,
validation_split=0.2,
subset="validation",
seed=123,
image_size=(150, 150),
batch_size=32)
# test set
val_batches = tf.data.experimental.cardinality(validation_set)
test_set = validation_set.take(val_batches // 5)
validation_set = validation_set.skip(val_batches // 5)
# build and train the model
# ......
# At the end of training:
# accuracy: 0.9047  val_accuracy: 0.8942

# evaluate the model
loss, accuracy = model.evaluate(test_set)
# I ran this line several times and it returned a different accuracy
# each time, e.g. ranging from 0.902 to 0.934
Upvotes: 0
Views: 66
Reputation: 319
I managed to reproduce your results. The problem arises only when you use a GPU, because some GPU operations are not deterministic. Try setting:
import os

# must be set before TensorFlow is imported, or it has no effect
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
Alternatively, if you use Google Colab: Runtime > Change runtime type > Hardware accelerator > None.
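If you would rather keep the GPU, there is also an environment-variable route; a minimal sketch, assuming a TensorFlow version that honors `TF_DETERMINISTIC_OPS` (roughly 2.1-2.8; from 2.9 onward the supported way is calling `tf.config.experimental.enable_op_determinism()` after import instead):

```python
import os

# Both variables must be set BEFORE `import tensorflow`, because
# TensorFlow reads them once when it initializes.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"  # hide the GPU entirely, or...
os.environ["TF_DETERMINISTIC_OPS"] = "1"   # ...keep it but request
                                           # deterministic GPU kernels

# import tensorflow as tf  # import only after the variables are set
```

Note that deterministic kernels are typically slower, so this trades evaluation speed for reproducibility.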
Upvotes: 0
Reputation: 319
Where is your model defined? If you define the model with random initialization (i.e., you don't specify an initializer) and the model is created every time you launch the script, it's normal to get a different accuracy each time.
Initialization plays a role: if you start with two differently initialized models, after training you will usually end up with two similar but still different models.
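The effect is easy to see with a seeded RNG standing in for the layer initializer (plain Python here, purely illustrative; `init_weights` is a hypothetical helper, not a Keras API):

```python
import random

def init_weights(seed, n=5):
    # Stand-in for a layer initializer: draw n "weights" from a seeded RNG.
    rng = random.Random(seed)
    return [round(rng.uniform(-0.1, 0.1), 4) for _ in range(n)]

# Same seed -> identical starting weights -> reproducible runs.
assert init_weights(123) == init_weights(123)
# Different seeds -> different starting points -> (slightly) different models.
assert init_weights(123) != init_weights(456)
```

In Keras the equivalent is seeding the initializer explicitly, e.g. `kernel_initializer=tf.keras.initializers.GlorotUniform(seed=123)`, or calling `tf.keras.utils.set_random_seed(123)` once before building the model.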
Upvotes: 1