Reputation: 35
I'm working on a multi-class classification problem in Python (4 classes). To get the results for each class separately, I used the following code:
import numpy as np
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
cnf_matrix = cm
FP = cnf_matrix.sum(axis=0) - np.diag(cnf_matrix)
FN = cnf_matrix.sum(axis=1) - np.diag(cnf_matrix)
TP = np.diag(cnf_matrix)
TN = cnf_matrix.sum() - (FP + FN + TP)
FP = FP.astype(float)
FN = FN.astype(float)
TP = TP.astype(float)
TN = TN.astype(float)
# Sensitivity, hit rate, recall, or true positive rate
TPR = TP/(TP+FN)
print('TPR : ',TPR)
# Specificity or true negative rate
TNR = TN/(TN+FP)
print('TNR : ',TNR)
# Precision or positive predictive value
PPV = TP/(TP+FP)
print('PPV : ',PPV)
# Fall out or false positive rate
FPR = FP/(FP+TN)
print('FPR : ',FPR)
# False negative rate
FNR = FN/(TP+FN)
print('FNR : ',FNR)
# Per-class (one-vs-rest) accuracy
ACC = (TP+TN)/(TP+FP+FN+TN)
print('ACC : ',ACC)
I obtained the following results:
TPR : [0.98398792 0.99999366 0.99905393 0.99999548]
TNR : [0.99999211 0.99997989 1. 0.99773928]
PPV : [0.99988488 0.99996832 1. 0.99810887]
FPR : [7.89469529e-06 2.01061605e-05 0.00000000e+00 2.26072224e-03]
FNR : [1.60120846e-02 6.33705530e-06 9.46073794e-04 4.52196090e-06]
ACC : [0.99894952 0.99998524 0.99999754 0.99896674]
Now I want to calculate the average value of each metric. Should I just add the four values together and divide the result by 4? For example, for the accuracy (ACC): (0.99894952 + 0.99998524 + 0.99999754 + 0.99896674)/4? Or what exactly should I do? Please help.
Upvotes: 3
Views: 1618
Reputation: 2104
Accuracy is the total number of correct predictions divided by the total number of predictions. Now let's say you have a test set with 45 entries and 4 classes:
class 1: 10 rows
class 2: 10 rows
class 3: 10 rows
class 4: 15 rows
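Now the per-class accuracy (correct predictions for a class divided by the number of rows in that class, which is TPR in your code) is: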
class 1: 1 (10/10)
class 2: 1 (10/10)
class 3: 1 (10/10)
class 4: 0.33 (5/15)
Now if you sum all the per-class accuracies and divide by 4, i.e. your approach, the answer is 3.33/4 = 0.83.
If you instead count the total number of correct predictions, that is 35 out of 45, the accuracy is 35/45 = 0.78.
So the two methods are not the same. Taking the average of the per-class accuracies, i.e. what you are doing, only works if all classes are balanced; otherwise it's the wrong method.
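The gap between the two numbers can be checked with a short sketch. The confusion matrix below is an assumption: the example only says that 10 of the 15 class 4 rows are wrong, so here they are all placed in class 1 purely for illustration.

```python
import numpy as np

# Rows = true class, columns = predicted class.
# Assumed layout: the 10 misclassified class 4 rows are all predicted as class 1.
cm = np.array([
    [10,  0,  0, 0],
    [ 0, 10,  0, 0],
    [ 0,  0, 10, 0],
    [10,  0,  0, 5],
])

support = cm.sum(axis=1)             # rows per true class: 10, 10, 10, 15
per_class = np.diag(cm) / support    # correct / rows, per class

macro_avg = per_class.mean()         # plain average over the 4 classes
overall = np.trace(cm) / cm.sum()    # total correct / total predictions
weighted_avg = np.average(per_class, weights=support)  # weight by class size

print(f"{macro_avg:.2f}")     # 0.83
print(f"{overall:.2f}")       # 0.78
print(f"{weighted_avg:.2f}")  # 0.78
```

Weighting each class by its number of rows gives the same value as the overall accuracy, which is why the plain average only matches when the classes have equal size.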
You should calculate the total number of correct predictions and divide it by the total number of predictions, i.e. correct / (correct + wrong).
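As a rough sketch of how this could be applied to a confusion matrix like the `cnf_matrix` in the question: the helper name `summarize` and the 3-class matrix are hypothetical, chosen only for illustration. It computes the overall accuracy from the pooled counts and, for comparison, the plain (macro) average and the pooled-count (micro) average of TPR and PPV.

```python
import numpy as np

def summarize(cnf_matrix):
    """Hypothetical helper: overall accuracy plus macro/micro TPR and PPV."""
    cm = np.asarray(cnf_matrix, dtype=float)
    TP = np.diag(cm)
    FP = cm.sum(axis=0) - TP
    FN = cm.sum(axis=1) - TP
    return {
        # total correct / total predictions
        "overall_accuracy": TP.sum() / cm.sum(),
        # macro: compute the metric per class, then take the plain mean
        "macro_TPR": np.mean(TP / (TP + FN)),
        "macro_PPV": np.mean(TP / (TP + FP)),
        # micro: pool the counts over all classes, then compute the metric once
        "micro_TPR": TP.sum() / (TP.sum() + FN.sum()),
        "micro_PPV": TP.sum() / (TP.sum() + FP.sum()),
    }

# Made-up, imbalanced 3-class confusion matrix (rows = true, columns = predicted)
example = [
    [8, 2,  0],
    [1, 5,  4],
    [0, 0, 30],
]
result = summarize(example)
for name, value in result.items():
    print(name, round(value, 4))
```

When every sample has exactly one true label and one predicted label, the pooled TPR and PPV both equal the overall accuracy, while the macro averages can differ from it on imbalanced data. The sketch does not guard against a class with no true or no predicted samples, which would produce a division by zero.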
Upvotes: 1