Reputation: 25
I'm having trouble interpreting the output of the Keras model.fit() method.
The setup
print(tf.version.VERSION) # 2.3.0
print(keras.__version__) # 2.4.0
I have a simple feedforward model for a 3-class classification problem:
def get_baseline_mlp(signal_length):
    input_tensor = keras.layers.Input(signal_length, name="input")
    dense_1 = keras.layers.Flatten()(input_tensor)
    dense_1 = keras.layers.Dense(name='dense_1', activation='relu', units=500)(dense_1)
    dense_1 = keras.layers.Dense(name='dense_2', activation='relu', units=500)(dense_1)
    dense_1 = keras.layers.Dense(name='dense_3', activation='relu', units=500)(dense_1)
    # softmax output over the 3 classes
    dense_1 = keras.layers.Dense(name='dense_4', activation='softmax', units=3, bias_initializer='zeros')(dense_1)
    model = keras.models.Model(inputs=input_tensor, outputs=[dense_1])
    model.summary()
    return model
My training data are univariate time series, and my output is a one-hot encoded vector of length 3 (I have 3 classes in my classification problem).
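For reference, the labels are encoded roughly like this (a minimal sketch using keras.utils.to_categorical; the integer labels y below are made up):

from tensorflow import keras
import numpy as np

y = np.array([0, 2, 1, 2])  # hypothetical integer class labels
y_onehot = keras.utils.to_categorical(y, num_classes=3)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]
#  [0. 0. 1.]]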
The model is compiled as follows:
mlp_base.compile(optimizer=optimizer,
                 loss='categorical_crossentropy',
                 metrics=['categorical_accuracy'])
I have a function that manually calculates the accuracy of my predictions with two methods:
def get_accuracy(model, true_x, true_y):
    # method 1 - round each softmax output and compare full one-hot rows
    res = model.predict(true_x)
    res = np.rint(res)
    right = 0
    for i in range(len(true_y[:, 0])):
        if np.array_equal(res[i, :], true_y[i, :]):
            # print(res[i, :], true_y[i, :])
            right += 1
        else:
            pass
    tot = len(true_y[:, 0])
    print('True - total', right, tot)
    print('acc: {}'.format(right / tot))
    print()
    print(' method 2 - categorical')
    # method 2 - compare argmax class indices
    res = model.predict(true_x)
    res = np.argmax(res, axis=-1)
    true_y = np.argmax(true_y, axis=-1)
    right = 0
    for i in range(len(true_y)):
        if res[i] == true_y[i]:
            right += 1
        else:
            pass
    tot = len(true_y)
    print('True - total', right, tot)
    print('acc: {}'.format(right / tot))
The Problem
At the end of training, the categorical accuracy reported by fit() does not match the one I get using my custom function.
Training output:
Model: "functional_17"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
input (InputLayer) [(None, 9000)] 0
_________________________________________________________________
flatten_8 (Flatten) (None, 9000) 0
_________________________________________________________________
dense_1 (Dense) (None, 500) 4500500
_________________________________________________________________
dense_2 (Dense) (None, 500) 250500
_________________________________________________________________
dense_3 (Dense) (None, 500) 250500
_________________________________________________________________
dense_4 (Dense) (None, 3) 1503
=================================================================
Total params: 5,003,003
Trainable params: 5,003,003
Non-trainable params: 0
-------------------------------------------------------------------
Fit model on training data
Epoch 1/2
20/20 [==] - 0s 14ms/step - loss: 1.3796 - categorical_accuracy: 0.3250 - val_loss: 0.9240
Epoch 2/2
20/20 [==] - 0s 8ms/step - loss: 0.8131 - categorical_accuracy: 0.6100 - val_loss: 1.2811
Output of accuracy function:
True - total 169 200
acc: 0.845

 method 2 - categorical
True - total 182 200
acc: 0.91
Why am I getting wrong results? Is my accuracy implementation wrong?
Edit: even after correcting the settings as desertnaut suggested, the discrepancy persists.
Output of fit:
Epoch 1/3
105/105 [===] - 1s 9ms/step - loss: 1.7666 - categorical_accuracy: 0.2980
Epoch 2/3
105/105 [===] - 1s 6ms/step - loss: 1.2380 - categorical_accuracy: 0.4432
Epoch 3/3
105/105 [===] - 1s 5ms/step - loss: 1.0318 - categorical_accuracy: 0.5989
If I use the categorical accuracy metric provided by Keras, I still get different results.
cat_acc = keras.metrics.CategoricalAccuracy()
cat_acc.update_state(tr_y2, y_pred)
print(cat_acc.result().numpy()) # outputs : 0.7211079
Interestingly, if I compute the validation accuracy with the methods above, I get consistent output.
Upvotes: 1
Views: 900
Reputation: 60321
Not quite sure about your accuracy calculation (it seems unnecessarily convoluted, and we always prefer vectorized calculations over for loops), but there are two issues with your code that may impact the results (or even render them meaningless).
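For reference, a minimal vectorized sketch of the same argmax-based accuracy (assuming model, true_x, and true_y as defined in your question):

import numpy as np

def get_accuracy_vectorized(model, true_x, true_y):
    # predicted class = index of the largest softmax score per sample
    pred = np.argmax(model.predict(true_x), axis=-1)
    true = np.argmax(true_y, axis=-1)  # one-hot -> class indices
    return np.mean(pred == true)       # fraction of exact matches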
The first issue is that, since you are in a multi-class setting, you should compile your model with loss='categorical_crossentropy', and not 'binary_crossentropy'; check my own answer in Why binary_crossentropy and categorical_crossentropy give different performances for the same problem? to see what may happen when you mix losses and accuracies that way (plus, a 'binary_accuracy' here is absolutely meaningless).
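In code, the corrected compilation would look roughly like this (a sketch; optimizer stands for whatever optimizer you are already using):

model.compile(optimizer=optimizer,
              loss='categorical_crossentropy',
              metrics=['categorical_accuracy'])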
The second issue is that you erroneously use activation='sigmoid' for your last layer: since you are in a multi-class (not multi-label) setting with your labels one-hot encoded, the activation in your last layer should be softmax, and not sigmoid.
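That is, the last layer should be defined roughly as follows (a sketch reusing the layer names from your model):

# softmax so the 3 outputs form a probability distribution over the classes
dense_1 = keras.layers.Dense(name='dense_4', activation='softmax', units=3)(dense_1)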
Upvotes: 1