Reputation: 35
I'm trying to find the precision and recall for the confusion matrix given below, but I ran into an error. How can I calculate them using NumPy and scikit-learn?
array([[748, 0, 4, 5, 1, 16, 9, 4, 8, 0],
[ 0, 869, 6, 5, 2, 2, 2, 5, 12, 3],
[ 6, 19, 642, 33, 13, 7, 16, 15, 31, 6],
[ 5, 3, 30, 679, 2, 44, 1, 12, 23, 12],
[ 4, 7, 9, 2, 704, 5, 10, 8, 7, 43],
[ 5, 6, 10, 39, 11, 566, 18, 4, 33, 10],
[ 6, 5, 17, 2, 5, 12, 737, 2, 9, 3],
[ 5, 7, 8, 18, 14, 2, 0, 752, 5, 42],
[ 7, 15, 34, 28, 12, 29, 6, 4, 600, 18],
[ 4, 6, 6, 16, 21, 4, 0, 50, 8, 680]], dtype=int64)
Upvotes: 0
Views: 1758
Reputation: 19312
As already recommended by someone else, you can directly calculate over y_actual and y_predicted using sklearn.metrics with precision_score and recall_score to get what you need. Read more here on precision and recall scores.
But, IIUC, you are looking to do the same directly from a confusion matrix. Here is how you calculate precision and recall from it.
NOTE: There are 2 types of precision and recall that are generally calculated -
- Micro precision: sum the TP across all classes and divide by the total TP+FP
- Macro precision: calculate TP/(TP+FP) for each class separately, then take the average (ignoring NaNs)
- You can find more details on types of precision (and recall) here.
I show both methods below -
import numpy as np
from sklearn.metrics import confusion_matrix, precision_score, recall_score
####################################################
#####Using SKLEARN API on TRUE & PRED Labels########
####################################################
y_true = [0, 1, 2, 2, 1, 1]
y_pred = [0, 2, 2, 2, 1, 2]
confusion_matrix(y_true, y_pred)
precision_micro = precision_score(y_true, y_pred, average="micro")
precision_macro = precision_score(y_true, y_pred, average="macro")
recall_micro = recall_score(y_true, y_pred, average="micro")
recall_macro = recall_score(y_true, y_pred, average="macro")
print("Sklearn API")
print("precision_micro:", precision_micro)
print("precision_macro:", precision_macro)
print("recall_micro:", recall_micro)
print("recall_macro:", recall_macro)
####################################################
####Calculating directly from confusion matrix######
####################################################
cf = confusion_matrix(y_true, y_pred)
TP = cf.diagonal()                           # true positives per class
precision_micro = TP.sum()/cf.sum()
recall_micro = TP.sum()/cf.sum()
#NOTE: The sum of row-wise sums of a matrix = sum of column-wise sums = sum of all elements of the matrix.
#Therefore, micro-precision and micro-recall are mathematically the same for a multi-class problem.
precision_macro = np.nanmean(TP/cf.sum(0))   # column sums = predicted counts per class
recall_macro = np.nanmean(TP/cf.sum(1))      # row sums = actual counts per class
print("")
print("Calculated:")
print("precision_micro:", precision_micro)
print("precision_macro:", precision_macro)
print("recall_micro:", recall_micro)
print("recall_macro:", recall_macro)
Sklearn API
precision_micro: 0.6666666666666666
precision_macro: 0.8333333333333334
recall_micro: 0.6666666666666666
recall_macro: 0.7777777777777777
Calculated:
precision_micro: 0.6666666666666666
precision_macro: 0.8333333333333334
recall_micro: 0.6666666666666666
recall_macro: 0.7777777777777777
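To see where those numbers come from, here is the arithmetic spelled out on the toy confusion matrix (just a sanity check of the outputs above, nothing new):

import numpy as np

cf = np.array([[1, 0, 0],     # confusion_matrix([0,1,2,2,1,1], [0,2,2,2,1,2])
               [0, 1, 2],
               [0, 0, 2]])
TP = cf.diagonal()            # [1, 1, 2]
print(TP / cf.sum(0))         # per-class precision [1.0, 1.0, 0.5]    -> macro = 0.8333
print(TP / cf.sum(1))         # per-class recall    [1.0, 0.3333, 1.0] -> macro = 0.7777
print(TP.sum() / cf.sum())    # micro = 4/6 = 0.6667 (same for precision and recall)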
cf = [[748, 0, 4, 5, 1, 16, 9, 4, 8, 0],
[ 0, 869, 6, 5, 2, 2, 2, 5, 12, 3],
[ 6, 19, 642, 33, 13, 7, 16, 15, 31, 6],
[ 5, 3, 30, 679, 2, 44, 1, 12, 23, 12],
[ 4, 7, 9, 2, 704, 5, 10, 8, 7, 43],
[ 5, 6, 10, 39, 11, 566, 18, 4, 33, 10],
[ 6, 5, 17, 2, 5, 12, 737, 2, 9, 3],
[ 5, 7, 8, 18, 14, 2, 0, 752, 5, 42],
[ 7, 15, 34, 28, 12, 29, 6, 4, 600, 18],
[ 4, 6, 6, 16, 21, 4, 0, 50, 8, 680]]
cf = np.array(cf)
TP = cf.diagonal()
precision_micro = TP.sum()/cf.sum()
recall_micro = TP.sum()/cf.sum()
precision_macro = np.nanmean(TP/cf.sum(0))
recall_macro = np.nanmean(TP/cf.sum(1))
print("Calculated:")
print("precision_micro:", precision_micro)
print("precision_macro:", precision_macro)
print("recall_micro:", recall_micro)
print("recall_macro:", recall_macro)
Calculated:
precision_micro: 0.872125
precision_macro: 0.8702549015235986
recall_micro: 0.872125
recall_macro: 0.8696681555022805
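If you ever want sklearn itself to do the averaging when all you have is the matrix, one workaround (my own sketch, not a dedicated sklearn API) is to expand the matrix back into pseudo label arrays; the original sample order is lost, but these metrics don't depend on it:

import numpy as np
from sklearn.metrics import precision_score, recall_score

cf = np.array([[1, 0, 0],     # toy matrix again; swap in your 10x10 array
               [0, 1, 2],
               [0, 0, 2]])
labels = np.arange(cf.shape[0])
# Row i of cf holds counts for true class i, so repeat label i once per sample in that row.
y_true = np.repeat(labels, cf.sum(axis=1))
# Within row i, prediction j occurs cf[i, j] times.
y_pred = np.concatenate([np.repeat(labels, row) for row in cf])
print(precision_score(y_true, y_pred, average="macro"))  # 0.8333...
print(recall_score(y_true, y_pred, average="macro"))     # 0.7777...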
Upvotes: 3
Reputation: 567
You can use scikit-learn to calculate the recall and precision of each class.
Example:
from sklearn.metrics import classification_report
y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
target_names = ['class 0', 'class 1', 'class 2']
print(classification_report(y_true, y_pred, target_names=target_names))
              precision    recall  f1-score   support

     class 0       0.50      1.00      0.67         1
     class 1       0.00      0.00      0.00         1
     class 2       1.00      0.67      0.80         3

    accuracy                           0.60         5
   macro avg       0.50      0.56      0.49         5
weighted avg       0.70      0.60      0.61         5
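If you need those per-class numbers programmatically rather than as printed text, precision_recall_fscore_support returns them as arrays (a minimal sketch; zero_division=0 needs a reasonably recent scikit-learn and just silences the undefined-metric warning for class 1):

from sklearn.metrics import precision_recall_fscore_support

y_true = [0, 1, 2, 2, 2]
y_pred = [0, 0, 2, 2, 1]
# average=None returns one value per class instead of a single aggregate.
precision, recall, f1, support = precision_recall_fscore_support(
    y_true, y_pred, average=None, zero_division=0)
print(precision)  # [0.5 0.  1. ]
print(recall)     # [1.     0.     0.6667]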
Reference here
Upvotes: 2