Reputation: 75
I am training a model to classify 3 classes of images using TensorFlow. Currently I am using the bog-standard ResNet50 definition:
from tensorflow.keras.layers import Input

input_layer = Input(shape=(203, 147, 256))
model = tf.keras.applications.resnet50.ResNet50(weights=None, input_tensor=input_layer, classes=3)
Here is my model compilation:
model.compile(optimizer=tf.keras.optimizers.Adam(),  # default learning_rate=0.001
              loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=[tf.keras.metrics.Accuracy()])
Having split my data using sklearn.model_selection.train_test_split, I instantiate TensorFlow Dataset objects:
train_ds = tf.data.Dataset.from_tensor_slices((X_train, y_train))
val_ds = tf.data.Dataset.from_tensor_slices((X_val, y_val))
train_ds = train_ds.batch(2)
val_ds = val_ds.batch(2)
And begin training:
history = model.fit(train_ds,
validation_data=val_ds,
epochs=200,
verbose=2)  # verbose=2 means one line per epoch, no progress bar
When training, I am severely overfitting, but that is an issue for later. My issue now is that my training loss is extremely small, yet my training accuracy is 0 or near zero. Here is the final training epoch:
Epoch 200/200
33/33 - 3s - loss: 4.4143e-06 - accuracy: 0.0000e+00 - val_loss: 1.5500 - val_accuracy: 0.0000e+00
I thought that TensorFlow calculates accuracy based on the training loss. I have absolutely no idea why my accuracy is zero when my loss looks so good. Am I using the proper loss and metrics? Has anyone seen this behaviour before? Any help is appreciated.
Addendum:
My training labels are one hot encoded, and my training and validation confusion matrices are below, respectively:
[[23 0 0]
[ 0 21 0]
[ 0 0 22]]
[[0 0 7]
[0 7 2]
[0 0 7]]
(I know I am super overfitting, but that is a problem for later)
I am starting to think there is something wrong with the metric itself, because the training classifications look pretty good.
Upvotes: 0
Views: 2589
Reputation: 5079
That's because tf.keras.metrics.Accuracy() works differently: it checks whether the predicted values exactly match the corresponding labels. Here is the source code if you want to check that.
Under the hood, tf.keras.metrics.Accuracy() calculates tf.equal(y_true, y_pred). In your case y_pred contains floating-point numbers, whereas the labels (y_true) are one-hot encoded integers.
Consider the case:
y_pred = array([0.5403488 , 0.24064924, 0.219002 ], dtype=float32)
y_true = array([1., 0., 0.], dtype=float32)
Well, if you check the result of tf.equal(y_true, y_pred), it will yield only False values.
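A minimal sketch of the mismatch, assuming TensorFlow 2.x (the metric values below come from running this snippet):

```python
import tensorflow as tf

# One-hot label and a softmax-style prediction for a single sample.
y_true = [[1.0, 0.0, 0.0]]
y_pred = [[0.54, 0.24, 0.22]]

# Accuracy() compares values element-wise, so floats never
# exactly equal the 0/1 entries of a one-hot label.
acc = tf.keras.metrics.Accuracy()
acc.update_state(y_true, y_pred)
print(acc.result().numpy())  # 0.0

# CategoricalAccuracy() compares argmax(y_true) to argmax(y_pred),
# which is the right comparison for one-hot labels.
cat_acc = tf.keras.metrics.CategoricalAccuracy()
cat_acc.update_state(y_true, y_pred)
print(cat_acc.result().numpy())  # 1.0
```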
In your case, you should use either metrics=['accuracy'] (the string is resolved to the right metric for your loss) or metrics=[tf.keras.metrics.CategoricalAccuracy()].
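Concretely, the fix is just swapping the metric in the compile call. A runnable sketch with a tiny stand-in model (the question uses ResNet50; the small Dense model here is only to keep the example fast and self-contained):

```python
import numpy as np
import tensorflow as tf

# Tiny stand-in for the ResNet50 in the question: 3-class softmax output.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(4,)),
    tf.keras.layers.Dense(3, activation="softmax"),
])

# Same compile call as the question, with CategoricalAccuracy instead of Accuracy.
model.compile(optimizer=tf.keras.optimizers.Adam(),
              loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=[tf.keras.metrics.CategoricalAccuracy()])

# One-hot labels, as in the question's setup.
X = np.random.rand(8, 4).astype("float32")
y = tf.keras.utils.to_categorical(np.random.randint(0, 3, 8), num_classes=3)
history = model.fit(X, y, epochs=1, verbose=0)
print(history.history.keys())  # includes 'categorical_accuracy'
```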
Upvotes: 1