Reputation: 1652
Here is how I loaded the data, which consists of two folders of image data:
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    main_folder,
    validation_split=0.1,
    subset="training",
    seed=123,
    image_size=(dim, dim))
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    main_folder,
    validation_split=0.1,
    subset="validation",
    seed=123,
    image_size=(dim, dim))
Loading the data from the folder gives:
Found 6457 files belonging to 2 classes.
Using 5812 files for training.
Found 6457 files belonging to 2 classes.
Using 645 files for validation.
Here is how I trained my model:
model = tf.keras.models.Sequential([
    tf.keras.layers.experimental.preprocessing.Rescaling(1. / 255),
    tf.keras.layers.Conv2D(16, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
    tf.keras.layers.MaxPooling2D(2, 2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(512, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss=tf.losses.BinaryCrossentropy(from_logits=True), optimizer="adam", metrics=["accuracy"])
es = EarlyStopping(monitor='val_accuracy', min_delta=0.1, patience=5)
model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=epc,
    callbacks=[es])
Here is how I got the results:
y_pred = model.predict(val_ds)
predicted_categories = tf.argmax(y_pred, axis=1)
true_categories = tf.concat([y for x, y in val_ds], axis=0)
print(classification_report(true_categories, predicted_categories))
The contradicting outputs are:
Epoch 1/100
182/182 [==============================] - 8s 44ms/step - loss: 0.6617 - accuracy: 0.5139 - val_loss: 0.6466 - val_accuracy: 0.3442
Epoch 2/100
182/182 [==============================] - 8s 46ms/step - loss: 0.6613 - accuracy: 0.5712 - val_loss: 0.6460 - val_accuracy: 0.6558
Epoch 3/100
182/182 [==============================] - 8s 44ms/step - loss: 0.6611 - accuracy: 0.5594 - val_loss: 0.6474 - val_accuracy: 0.3442
Epoch 4/100
182/182 [==============================] - 8s 46ms/step - loss: 0.6315 - accuracy: 0.6504 - val_loss: 0.4623 - val_accuracy: 0.9690
Epoch 5/100
182/182 [==============================] - 8s 46ms/step - loss: 0.4780 - accuracy: 0.9554 - val_loss: 0.4597 - val_accuracy: 0.9690
Epoch 6/100
182/182 [==============================] - 8s 45ms/step - loss: 0.4831 - accuracy: 0.9434 - val_loss: 0.4517 - val_accuracy: 0.9845
Epoch 7/100
182/182 [==============================] - 8s 45ms/step - loss: 0.4720 - accuracy: 0.9658 - val_loss: 0.4546 - val_accuracy: 0.9736
Epoch 8/100
182/182 [==============================] - 8s 44ms/step - loss: 0.4719 - accuracy: 0.9652 - val_loss: 0.4507 - val_accuracy: 0.9860
Epoch 9/100
182/182 [==============================] - 8s 44ms/step - loss: 0.4747 - accuracy: 0.9597 - val_loss: 0.4528 - val_accuracy: 0.9814
              precision    recall  f1-score   support

           0       0.34      1.00      0.51       222
           1       0.00      0.00      0.00       423

    accuracy                           0.34       645
   macro avg       0.17      0.50      0.26       645
weighted avg       0.12      0.34      0.18       645
Also, I get a different answer every time I execute it.
Can someone please explain why the classification report shows an accuracy of 34% while the model's val_accuracy is around 0.98?
Upvotes: 0
Views: 845
Reputation: 5079
tf.keras.preprocessing.image_dataset_from_directory has a parameter called label_mode, and its default value is 'int', which is suitable for sparse_categorical_crossentropy and similar losses. It should be changed to label_mode='binary' if you are doing binary classification.
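As a minimal sketch (assuming the same main_folder, dim, and seed as in the question), the loaders would become:
# Hypothetical adaptation of the question's loaders: label_mode="binary"
# yields float32 labels of shape (batch_size, 1), matching a single
# sigmoid output unit and a binary cross-entropy loss.
train_ds = tf.keras.preprocessing.image_dataset_from_directory(
    main_folder,
    validation_split=0.1,
    subset="training",
    seed=123,
    label_mode="binary",
    image_size=(dim, dim))
val_ds = tf.keras.preprocessing.image_dataset_from_directory(
    main_folder,
    validation_split=0.1,
    subset="validation",
    seed=123,
    label_mode="binary",
    image_size=(dim, dim))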
The contradiction is here:
tf.keras.layers.Dense(1, activation='sigmoid')
predicted_categories = tf.argmax(y_pred, axis=1)
With sigmoid, each of your outputs is a list with a shape of (1,). When you take argmax of that list, it always returns zero as the index, because the list has only one element. So you need to apply a thresholding method when using sigmoid. Sigmoid squeezes outputs into the range [0, 1], so you can do:
predicted_categories = [1 * (x[0]>=0.5) for x in y_pred]
or using numpy:
predicted_categories = np.where(y_pred > 0.5, 1, 0)
where 0.5 is the threshold. This means that if the predicted value is bigger than 0.5, it belongs to the second class. You can adjust the threshold depending on your needs.
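For completeness, here is a rough sketch of the corrected evaluation step, reusing the model, val_ds, and sklearn's classification_report from the question; the 0.5 threshold is just the conventional default:
import numpy as np
from sklearn.metrics import classification_report

# Sigmoid probabilities with shape (num_samples, 1)
y_pred = model.predict(val_ds)
# Threshold at 0.5 instead of taking argmax over a single column
predicted_categories = np.where(y_pred > 0.5, 1, 0).ravel()
# True labels gathered from the validation dataset, as in the question
true_categories = tf.concat([y for x, y in val_ds], axis=0)
print(classification_report(true_categories, predicted_categories))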
Upvotes: 3