Reputation: 157
Suppose I have a confusion matrix like the one below. How can I calculate precision and recall?
Upvotes: 12
Views: 28624
Reputation: 88
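import numpy as np

n_classes = 3
# rows = true labels, columns = predicted labels
cm = np.array([[0, 1, 2],
               [5, 4, 3],
               [8, 7, 6]])
sp = []    # per-class specificity
f1 = []    # per-class F1 score
gm = []    # per-class geometric mean of recall and specificity
sens = []  # per-class recall (sensitivity)
acc = []   # per-class true positives, summed later for overall accuracy
for c in range(n_classes):
    tp = cm[c, c]
    fp = sum(cm[:, c]) - cm[c, c]
    fn = sum(cm[c, :]) - cm[c, c]
    # everything outside row c and column c
    tn = sum(np.delete(sum(cm) - cm[c, :], c))
    recall = tp / (tp + fn)
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    specificity = tn / (tn + fp)
    # F1 is 0/0 when precision and recall are both 0 (class 0 here); report 0 instead of nan
    pr_sum = precision + recall
    f1_score = 2 * precision * recall / pr_sum if pr_sum > 0 else 0.0
    g_mean = np.sqrt(recall * specificity)
    sp.append(specificity)
    f1.append(f1_score)
    gm.append(g_mean)
    sens.append(recall)
    acc.append(tp)
    print("for class {}: recall {}, specificity {}, precision {}, f1 {}, gmean {}".format(
        c, round(recall, 4), round(specificity, 4), round(precision, 4),
        round(f1_score, 4), round(g_mean, 4)))
print("sp: ", np.average(sp))
print("f1: ", np.average(f1))
print("gm: ", np.average(gm))
print("sens: ", np.average(sens))
print("accuracy: ", np.sum(acc) / np.sum(cm))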
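The true-negative line in the loop is probably the least obvious step. As a sketch (not part of the answer above), the same per-class counts can be obtained without a loop, because the four counts for a class always add up to the total number of samples. It assumes the same layout as above: rows are true labels, columns are predicted labels.

```python
import numpy as np

cm = np.array([[0, 1, 2],
               [5, 4, 3],
               [8, 7, 6]])

tp = np.diag(cm)               # correct predictions for each class
fp = cm.sum(axis=0) - tp       # predicted as the class, but actually another one
fn = cm.sum(axis=1) - tp       # actually the class, but predicted as another one
tn = cm.sum() - tp - fp - fn   # every sample that does not involve the class

specificity = tn / (tn + fp)
print(tn)           # [20 16 10]
print(specificity)
```

For class 0 this gives tn = 20, the same value the `np.delete` expression in the loop produces.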
Upvotes: 0
Reputation: 580
Agreeing with gruangly and EuWern, I modified PabTorre's solution accordingly to generate precision and recall per class.
Also, given my use case (NER), where a model could never predict a class (producing nan in the precision array) or a class could never occur in the ground truth (producing nan in the recall array), I wrap each array with numpy.nan_to_num() to convert any nan to zero. This is not a mathematical decision, but a per-use-case, functional decision about how to handle never-predicted or never-occurring classes.
import numpy
confusion_matrix = numpy.array([
[ 5, 0, 0, 0, 0, 3],
[ 0, 2, 0, 1, 0, 5],
[ 0, 0, 0, 3, 5, 7],
[ 0, 0, 0, 9, 0, 0],
[ 0, 0, 0, 9, 32, 3],
[ 0, 0, 0, 0, 0, 0]
])
true_positives = numpy.diag(confusion_matrix)
false_positives = numpy.sum(confusion_matrix, axis=0) - true_positives
false_negatives = numpy.sum(confusion_matrix, axis=1) - true_positives
precision = numpy.nan_to_num(numpy.divide(true_positives, (true_positives + false_positives)))
recall = numpy.nan_to_num(numpy.divide(true_positives, (true_positives + false_negatives)))
print(true_positives) # [ 5 2 0 9 32 0 ]
print(false_positives) # [ 0 0 0 13 5 18 ]
print(false_negatives) # [ 3 6 15 0 12 0 ]
print(precision) # [1. 1. 0. 0.40909091 0.86486486 0. ]
print(recall) # [0.625 0.25 0. 1. 0.72727273 0. ]
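If the runtime warnings that numpy emits for the 0/0 entries are unwanted, one possible variation is to have numpy.divide skip those entries through its `out` and `where` arguments. This is only a sketch, not the answer's original code, and the helper name `safe_divide` is made up for illustration.

```python
import numpy

def safe_divide(numerator, denominator):
    """Element-wise division that yields 0 wherever the denominator is 0."""
    numerator = numpy.asarray(numerator, dtype=float)
    denominator = numpy.asarray(denominator, dtype=float)
    result = numpy.zeros_like(numerator)
    # Entries where the denominator is 0 are skipped and keep their initial 0
    numpy.divide(numerator, denominator, out=result, where=denominator != 0)
    return result

confusion_matrix = numpy.array([
    [5, 0, 0, 0, 0, 3],
    [0, 2, 0, 1, 0, 5],
    [0, 0, 0, 3, 5, 7],
    [0, 0, 0, 9, 0, 0],
    [0, 0, 0, 9, 32, 3],
    [0, 0, 0, 0, 0, 0],
])
true_positives = numpy.diag(confusion_matrix)
precision = safe_divide(true_positives, confusion_matrix.sum(axis=0))
recall = safe_divide(true_positives, confusion_matrix.sum(axis=1))
print(precision)
print(recall)
```

The result should match the nan_to_num() version: class 2 (never predicted) gets precision 0, and class 5 (never occurring) gets recall 0.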
Upvotes: 0
Reputation: 1581
Hypothetical confusion matrix (cm):
cm =
[[ 970 1 2 1 1 6 10 0 5 0]
[ 0 1105 7 3 1 6 0 3 16 0]
[ 9 14 924 19 18 3 13 12 24 4]
[ 3 10 35 875 2 34 2 14 19 19]
[ 0 3 6 0 903 0 9 5 4 32]
[ 9 6 4 28 10 751 17 5 24 9]
[ 7 2 6 0 9 13 944 1 7 0]
[ 3 11 17 3 16 3 0 975 2 34]
[ 5 38 10 16 7 28 5 4 830 20]
[ 5 3 5 13 39 10 2 34 5 853]]
Precision and recall for each class, using map() to calculate list division:
from operator import truediv
import numpy as np
tp = np.diag(cm)
prec = list(map(truediv, tp, np.sum(cm, axis=0)))
rec = list(map(truediv, tp, np.sum(cm, axis=1)))
print ('Precision: {}\nRecall: {}'.format(prec, rec))
Precision: [0.959, 0.926, 0.909, 0.913, 0.896, 0.880, 0.941, 0.925, 0.886, 0.877]
Recall: [0.972, 0.968, 0.888, 0.863, 0.937, 0.870, 0.954, 0.916, 0.861, 0.880]
Please note: 10 classes, so 10 precisions and 10 recalls.
Upvotes: 2
Reputation: 123
For the sake of completeness and future reference: given a list of ground-truth labels (gt) and a list of predictions (pd), the following code snippet computes the confusion matrix and then calculates precision and recall.
from sklearn.metrics import confusion_matrix
gt = [1,1,2,2,1,0]
pd = [1,1,1,1,2,0]
cm = confusion_matrix(gt, pd)
#rows = gt, col = pred
#compute tp, tp_and_fn and tp_and_fp w.r.t all classes
tp_and_fn = cm.sum(1)
tp_and_fp = cm.sum(0)
tp = cm.diagonal()
precision = tp / tp_and_fp
recall = tp / tp_and_fn
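As a hand check of the snippet above, the same numbers can be reproduced with numpy alone. This sketch assumes the labels are the integers 0 to 2 and builds the matrix with rows as ground truth and columns as predictions, which is the layout sklearn's confusion_matrix returns. The name `pred` is used instead of `pd` only to avoid clashing with the usual pandas alias.

```python
import numpy as np

gt = [1, 1, 2, 2, 1, 0]
pred = [1, 1, 1, 1, 2, 0]

n_classes = 3
cm = np.zeros((n_classes, n_classes), dtype=int)
np.add.at(cm, (gt, pred), 1)   # count each (true, predicted) pair
print(cm)
# [[1 0 0]
#  [0 2 1]
#  [0 2 0]]

tp = cm.diagonal()
precision = tp / cm.sum(axis=0)   # tp / (tp + fp), per class
recall = tp / cm.sum(axis=1)      # tp / (tp + fn), per class
print(precision)  # class 0: 1.0, class 1: 0.5, class 2: 0.0
print(recall)     # class 0: 1.0, class 1: 2/3, class 2: 0.0
```

Class 2 is never predicted correctly, so both of its values are 0 even though it appears twice in the ground truth.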
Upvotes: 2
Reputation: 3107
First, your matrix is arranged upside down. You want to arrange your labels so that the true positives sit on the diagonal [(0,0), (1,1), (2,2)]; this is the arrangement you will find in confusion matrices generated by sklearn and other packages.
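Once we have things sorted in the right direction, we can take a page from this answer and say that the true positives are on the diagonal, the false positives are the column-wise sums without the diagonal, and the false negatives are the row-wise sums without the diagonal.
Then we take some formulas from the sklearn docs for precision and recall, and put it all into code: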
import numpy as np
cm = np.array([[2,1,0], [3,4,5], [6,7,8]])
true_pos = np.diag(cm)
false_pos = np.sum(cm, axis=0) - true_pos
false_neg = np.sum(cm, axis=1) - true_pos
# one value per class (summing these ratios would not give a valid precision or recall)
precision = true_pos / (true_pos + false_pos)
recall = true_pos / (true_pos + false_neg)
Since we remove the true positives to define the false positives/negatives only to add them back, we can simplify further by skipping a couple of steps:
true_pos = np.diag(cm)
precision = true_pos / np.sum(cm, axis=0)
recall = true_pos / np.sum(cm, axis=1)
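If a single number per metric is wanted rather than one value per class, two common reductions are the macro average (the unweighted mean of the per-class values) and the micro average (pooling the counts over all classes first). A minimal sketch on the same matrix, with illustrative variable names:

```python
import numpy as np

cm = np.array([[2, 1, 0], [3, 4, 5], [6, 7, 8]])
true_pos = np.diag(cm)

# Per-class values: column sums are tp + fp, row sums are tp + fn
precision_per_class = true_pos / np.sum(cm, axis=0)
recall_per_class = true_pos / np.sum(cm, axis=1)

# Macro average: every class counts equally, regardless of its size
macro_precision = precision_per_class.mean()
macro_recall = recall_per_class.mean()

# Micro average: pool the counts first. When each sample has exactly one
# true and one predicted label, micro precision, micro recall and accuracy
# are all the same number.
micro = true_pos.sum() / cm.sum()

print(macro_precision, macro_recall, micro)
```

Which reduction is appropriate depends on whether small classes should weigh as much as large ones; macro averaging treats them equally, micro averaging lets frequent classes dominate.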
Upvotes: 13