Mick Hardins

Reputation: 25

Manually calculated categorical accuracy doesn't match the one reported by Keras

I'm having trouble interpreting the output of the Keras model.fit() method.

The setting

print(tf.version.VERSION) # 2.3.0
print(keras.__version__) # 2.4.0

I have a simple feedforward model for a 3-class classification problem:

def get_baseline_mlp(signal_length):
    input_tensor = keras.layers.Input(signal_length, name="input")
    dense_1 = keras.layers.Flatten()(input_tensor)

    dense_1 = keras.layers.Dense(name='dense_1',activation='relu',units=500)(dense_1)
    dense_1 = keras.layers.Dense(name='dense_2',activation='relu',units=500)(dense_1)
    dense_1 = keras.layers.Dense(name='dense_3',activation='relu',units=500)(dense_1)
    dense_1 = keras.layers.Dense(name='dense_4',activation='softmax',units=3, bias_initializer='zero')(dense_1)

    model = keras.models.Model(inputs=input_tensor, outputs=[dense_1])
    model.summary()
    return model

My training data are univariate timeseries, and my output is a one-hot encoded vector of length 3 (I have 3 classes in my classification problem).
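
For reference, the arrays I feed in look roughly like this (a minimal sketch with illustrative sizes; to_categorical just builds the one-hot labels):

import numpy as np
from tensorflow import keras

n_samples, signal_length, n_classes = 200, 9000, 3     # illustrative sizes
x = np.random.randn(n_samples, signal_length)          # univariate timeseries, one row per sample
y = keras.utils.to_categorical(
        np.random.randint(0, n_classes, size=n_samples),
        num_classes=n_classes)                         # one-hot labels, shape (200, 3)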

The model is compiled as follows:

mlp_base.compile(optimizer=optimizer,
                 loss='categorical_crossentropy',
                 metrics=['categorical_accuracy'])

I have a function that manually calculates the accuracy of my predictions with two methods:

def get_accuracy(model, true_x, true_y): 
    # method 1: round each softmax output and compare the full one-hot vectors
    res = model.predict(true_x)
    res = np.rint(res)
    right = 0
    for i in range(len(true_y[:, 0])):
        if np.array_equal(res[i, :], true_y[i, :]):
            #print(res[i, :], true_y[i, :])
            right += 1
        else:
            pass
    tot = len(true_y[:,0])
    print('True - total', right, tot)
    print('acc: {}'.format((right/tot)))
    print()
    print(' method 2 - categorical')
    # method 2: compare argmax of predictions against argmax of the one-hot labels
    res = model.predict(true_x)
    res = np.argmax(res, axis=-1)
    true_y = np.argmax(true_y, axis=-1)
    right = 0
    for i in range(len(true_y)):
        if res[i] == true_y[i]:
            right += 1
        else:
            pass
    tot = len(true_y)
    print('True - total', right, tot)
    print('acc: {}'.format((right/tot)))

The Problem

At the end of training, the reported categorical accuracy does not match the one I get using my custom function.

Training output:

Model: "functional_17"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input (InputLayer)           [(None, 9000)]            0         
_________________________________________________________________
flatten_8 (Flatten)          (None, 9000)              0         
_________________________________________________________________
dense_1 (Dense)              (None, 500)               4500500   
_________________________________________________________________
dense_2 (Dense)              (None, 500)               250500    
_________________________________________________________________
dense_3 (Dense)              (None, 500)               250500    
_________________________________________________________________
dense_4 (Dense)              (None, 3)                 1503      
=================================================================
Total params: 5,003,003
Trainable params: 5,003,003
Non-trainable params: 0
-------------------------------------------------------------------
Fit model on training data
Epoch 1/2
20/20 [==] - 0s 14ms/step - loss: 1.3796 - categorical_accuracy: 0.3250 - val_loss: 0.9240 - 
Epoch 2/2
20/20 [==] - 0s 8ms/step - loss: 0.8131 - categorical_accuracy: 0.6100 - val_loss: 1.2811

Output of accuracy function:

True / total 169 200
acc: 0.845

 method 2
True / total 182 200
acc: 0.91

Why am I getting wrong results? Is my accuracy implementation wrong?

Update

Even after correcting the settings as desertnaut suggested, it is still not working.

Output of fit:

Epoch 1/3
105/105 [===] - 1s 9ms/step - loss: 1.7666 - categorical_accuracy: 0.2980
Epoch 2/3
105/105 [===] - 1s 6ms/step - loss: 1.2380 - categorical_accuracy: 0.4432
Epoch 3/3
105/105 [===] - 1s 5ms/step - loss: 1.0318 - categorical_accuracy: 0.5989

If I use the CategoricalAccuracy metric from Keras, I still get different results.

cat_acc = keras.metrics.CategoricalAccuracy()
cat_acc.update_state(tr_y2, y_pred)
print(cat_acc.result().numpy()) # outputs : 0.7211079

Interestingly, if I compute the validation accuracy with the above methods, I get consistent results.
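
For completeness, a cross-check along the same lines would be model.evaluate, which recomputes the compiled loss and metric (a sketch; tr_x2 is a hypothetical name for the training inputs paired with tr_y2):

# hypothetical cross-check: tr_x2 is assumed to be the input array matching tr_y2
loss, cat_acc_eval = mlp_base.evaluate(tr_x2, tr_y2, verbose=0)
print(cat_acc_eval)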

Upvotes: 1

Views: 900

Answers (1)

desertnaut

Reputation: 60321

Not quite sure about your accuracy calculation (it seems unnecessarily convoluted, and we always prefer vectorized calculations over for loops; see the sketch just below), but there are two issues with your code that may impact the results (or even render them meaningless).
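
For reference, the argmax-based check can be written in a couple of vectorized lines (a minimal sketch, reusing the model, true_x and true_y names from the question):

import numpy as np

y_pred = np.argmax(model.predict(true_x), axis=-1)   # predicted class indices
y_true = np.argmax(true_y, axis=-1)                  # one-hot labels -> class indices
print('acc: {}'.format(np.mean(y_pred == y_true)))   # fraction of correct predictions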

The first issue is that, since you are in a multi-class setting, you should compile your model with loss='categorical_crossentropy', and not 'binary_crossentropy'; check my own answer to Why binary_crossentropy and categorical_crossentropy give different performances for the same problem? to see what may happen when you mix losses & accuracies that way (plus, a 'binary_accuracy' here is absolutely meaningless).

The second issue is that you erroneously use activation='sigmoid' for your last layer: since you are in a multi-class (not multi-label) setting with your labels one-hot encoded, the activation in your last layer should be softmax, and not sigmoid.
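
For concreteness, the suggested setup would look roughly like this (a sketch; the layer and variable names follow the question):

# last layer: softmax over the 3 classes (multi-class, one-hot labels)
dense_1 = keras.layers.Dense(name='dense_4', activation='softmax', units=3)(dense_1)

# compile with the matching loss and metric
mlp_base.compile(optimizer=optimizer,
                 loss='categorical_crossentropy',
                 metrics=['categorical_accuracy'])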

Upvotes: 1
