Reputation: 156
I am using both balanced_accuracy_score and accuracy_score from sklearn.metrics.
According to the documentation, those two metrics should be the same, but in my code the first gives me 96% and the second 97%, while the accuracy reported during training is 98%.
Can you explain the difference between the three accuracies and how each is computed?
Note: the problem is a multi-class classification problem with three classes.
I have attached code samples.
Training accuracy (Keras) is 98%:
from tensorflow.keras.optimizers import Adam

model.compile(loss='categorical_crossentropy',
              optimizer=Adam(lr=0.00001),
              metrics=['accuracy'])
balanced_accuracy_score gives 96%:
from sklearn.metrics import balanced_accuracy_score

balanced_accuracy_score(all_labels, all_predictions)
accuracy_score gives 97%:
from sklearn.metrics import accuracy_score

accuracy_score(all_labels, all_predictions)
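(For reference: the sklearn metrics above require hard class labels, not one-hot vectors or softmax probabilities. A minimal sketch of how all_labels and all_predictions are assumed to be built; model, x_test, and y_test are hypothetical placeholders not shown in the post.)

import numpy as np

# Hypothetical names: model, x_test, y_test are not part of the original post.
# Collapse softmax outputs and one-hot targets to integer class indices,
# since sklearn metrics compare label vectors, not probability matrices.
probs = model.predict(x_test)               # shape (n_samples, 3)
all_predictions = np.argmax(probs, axis=1)  # predicted class per sample
all_labels = np.argmax(y_test, axis=1)      # true class per sample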
Upvotes: 7
Views: 12973
Reputation: 571
Accuracy = (TP + TN) / (TP + TN + FP + FN) does not work well for imbalanced classes, so we use Balanced Accuracy instead. For the binary case it is defined as (TPR + TNR) / 2, where
TPR = true positive rate = TP / (TP + FN), also called 'sensitivity'
TNR = true negative rate = TN / (TN + FP), also called 'specificity'
For a multi-class problem like yours, sklearn generalizes this to the average of the per-class recalls. Balanced accuracy gives results very close to the ROC AUC score.
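A minimal sketch (with made-up binary labels, not the asker's data) showing that balanced_accuracy_score matches the (TPR + TNR) / 2 formula computed from the confusion matrix:

from sklearn.metrics import balanced_accuracy_score, confusion_matrix

# Toy imbalanced data: 8 negatives, 2 positives (illustrative only)
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
tpr = tp / (tp + fn)  # sensitivity: 1/2 = 0.5
tnr = tn / (tn + fp)  # specificity: 7/8 = 0.875
print((tpr + tnr) / 2)                          # 0.6875
print(balanced_accuracy_score(y_true, y_pred))  # 0.6875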
Links:
1. https://en.wikipedia.org/wiki/Precision_and_recall
2. https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_auc_score.html
Upvotes: 8
Reputation: 1013
As far as I understand the problem (without knowing what all_labels and all_predictions contain), the difference in your out-of-sample scores between balanced_accuracy_score and accuracy_score is caused by the balancing done by the former function.

accuracy_score simply returns the percentage of labels you predicted correctly (i.e. if there are 1000 labels and you predicted 980 correctly, you get a score of 98%).

balanced_accuracy_score works differently in that it returns the average accuracy per class, which is a different metric. Say your 1000 labels come from 2 classes, with 750 observations in class 1 and 250 in class 2. If you mispredict 10 in each class, you have an accuracy of 740/750 = 98.67% in class 1 and 240/250 = 96% in class 2. balanced_accuracy_score would then return (98.67% + 96%) / 2 ≈ 97.33%. So I believe the function works as expected, based on the documentation.
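To verify the worked example above in code (the arrays are constructed to match the described class counts; illustrative only):

import numpy as np
from sklearn.metrics import accuracy_score, balanced_accuracy_score

# 750 samples of class 1 and 250 of class 2, with 10 mispredictions in each
y_true = np.array([1] * 750 + [2] * 250)
y_pred = np.array([1] * 740 + [2] * 10 + [2] * 240 + [1] * 10)

print(accuracy_score(y_true, y_pred))           # 980/1000 = 0.98
print(balanced_accuracy_score(y_true, y_pred))  # (740/750 + 240/250)/2 ≈ 0.9733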
Upvotes: 16