Reputation: 3272
I am running a neural network with different activation functions to see their effect on learning. I am using the MNIST dataset and have two hidden layers. I am getting the following learning curves for accuracies and errors.
From the accuracy curve it is obvious that the sigmoid performs the worst. But when you look at the error plot, it seems to have a final error significantly lower than the others. It has low accuracy and low error? I don't understand how this is possible. Can someone please explain what's going on here? Is this possible or am I making some mistake?
Upvotes: 1
Views: 990
Reputation: 311
Firstly, it would be easier to interpret the plots if you provided some more information on how you obtained them. Are they both computed on the same dataset? I'm also assuming you're using a softmax function at the last dense layer and optimizing a cross-entropy loss function.
loss_i = -log(p_i)
Here p_i is the softmax probability the model assigns to the correct class of the i-th image. (The model outputs a probability for each of the 10 classes, but the cross-entropy loss only uses the probability assigned to the correct class.) The loss is then averaged over all images in the data.
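In code, that loss looks roughly like this (a minimal NumPy sketch of the usual softmax + cross-entropy setup; the function names are mine, not from your model):

import numpy as np

def softmax(logits):
    # subtract the row max first so the exponentials don't overflow
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cross_entropy(probs, labels):
    # -log(p_i), where p_i is the probability assigned to the correct
    # class of the i-th image, averaged over the batch
    p_correct = probs[np.arange(len(labels)), labels]
    return -np.log(p_correct).mean()

logits = np.random.randn(3, 10)   # toy batch: 3 images, 10 MNIST classes
labels = np.array([2, 7, 0])
print(cross_entropy(softmax(logits), labels))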
This is what I see from the two plots: the first shows that the sigmoid model misclassifies more images than the ReLU-related models, hence its lower accuracy. However, the second plot shows that, on average, when the sigmoid model classifies an image correctly it assigns a higher probability to the correct class (closer to 100%), and/or when it misclassifies an image it is wrong only by a small margin.
The ReLU-related models seem to be better at predicting the correct class. However, when they are wrong, they seem to miss horribly, i.e. they assign a very low probability to the correct class, which makes -log(p_i) blow up.
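To see how low accuracy and low loss can coexist, here is a tiny worked example (the numbers are made up to mirror what the plots suggest, they are not from your run):

import numpy as np

# probability each model assigns to the *correct* class of four images,
# plus whether the argmax landed on the correct class (stipulated here;
# in reality it depends on all 10 class probabilities)
relu_p   = np.array([0.95, 0.95, 0.95, 0.01])  # confident, badly wrong once
relu_hit = np.array([True, True, True, False])
sigm_p   = np.array([0.60, 0.60, 0.45, 0.45])  # never confident, wrong twice
sigm_hit = np.array([True, True, False, False])

for name, p, hit in [("ReLU-like", relu_p, relu_hit),
                     ("sigmoid-like", sigm_p, sigm_hit)]:
    acc = hit.mean()          # fraction of correct argmax predictions
    loss = -np.log(p).mean()  # average cross-entropy
    print(f"{name}: accuracy={acc:.2f}, loss={loss:.2f}")

# ReLU-like:    accuracy=0.75, loss=1.19
# sigmoid-like: accuracy=0.50, loss=0.65

The single confidently wrong prediction dominates the ReLU-like model's average loss, so the sigmoid-like model ends up with both the lower accuracy and the lower loss.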
Why would that be? How is this related to the hidden layer activation functions?... I can't tell.
Upvotes: 2
Reputation: 1829
There are a few things that you should note. The loss is computed from the probability values the model outputs, while the accuracy only counts whether the predicted class (the argmax) matches the true label, ignoring how confident the prediction was. Therefore, if you want to compare two or more DNN models, it's better to use the accuracy of each model than the loss. Moreover, loss values depend on the particular loss function and have no fixed scale, unlike accuracy, which is always a fraction of correct predictions. Therefore, there is a basic difference between how loss and accuracy are calculated, as well as between their usages.
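As an illustration of that difference (a sketch with invented numbers): accuracy only checks the argmax, while the loss uses the probability value itself, so two predictions that accuracy counts as equally correct can have very different losses.

import numpy as np

def accuracy(probs, labels):
    # discrete: only whether the argmax matches the label counts
    return (probs.argmax(axis=1) == labels).mean()

def avg_loss(probs, labels):
    # continuous: the actual probability of the correct class counts
    return -np.log(probs[np.arange(len(labels)), labels]).mean()

# two predictions for the same image (true label 3, 10 classes):
confident = np.full((1, 10), 0.01); confident[0, 3] = 0.91
hesitant  = np.full((1, 10), 0.08); hesitant[0, 3]  = 0.28
labels = np.array([3])

for p in (confident, hesitant):
    print(accuracy(p, labels), avg_loss(p, labels))
# both score accuracy 1.0, but the losses are ~0.09 vs ~1.27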
Upvotes: -1