Reputation: 49
I have two different True/False results that tested the same column: Test 1 holds the results being checked (which may be wrong) and Test 2 holds the correct results. Is there Python code that can compare these two results and obtain a confusion matrix (true positives, false positives, false negatives, and true negatives)?
For example:
   Test1   Test2
a  True    True
b  True    True
c  False   True
d  False   True
e  True    True
f  True    True
g  True    False
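For reference, a minimal sketch that encodes this example as pandas Series; the variable names, and treating Test1 as the results being checked and Test2 as the correct values, are assumptions for illustration:
import pandas as pd

labels = list("abcdefg")

# Test1: the results being checked
test1 = pd.Series([True, True, False, False, True, True, True], index=labels)

# Test2: the correct (ground-truth) results
test2 = pd.Series([True, True, True, True, True, True, False], index=labels)
With Test2 as the truth, this example contains 4 true positives (a, b, e, f), 1 false positive (g), 2 false negatives (c, d) and 0 true negatives.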
Upvotes: 0
Views: 652
Reputation: 1367
Is there python code that can compare these two results and obtain a confusion matrix result (true positives, false positives, false negatives, and true negatives)?
Assuming Test1 and Test2 are pandas Series objects:
True positives: Test1 & Test2
False positives: Test1 & (Test2 == False)
False negatives: (Test1 == False) & Test2
True negatives: (Test1 == False) & (Test2 == False)
To get the number of True values in a boolean Series, use Series.sum() (Series.count() would count every non-NA entry, not just the True ones). For example, the number of true positives would be (Test1 & Test2).sum().
Assuming you want the confusion matrix as a NumPy array, you just fill in the cells appropriately:
import numpy as np
confusion = np.zeros((2, 2))
confusion[0, 0] = (Test1 & Test2).sum()
and so on...
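Completing that pattern, here is a minimal sketch; the helper name and the cell layout [[TP, FP], [FN, TN]] are illustrative choices, not something fixed by the answer:
import numpy as np
import pandas as pd

def confusion_from_series(test1, test2):
    # test1: boolean Series of results being checked, test2: boolean Series of correct values
    confusion = np.zeros((2, 2), dtype=int)
    confusion[0, 0] = (test1 & test2).sum()                        # true positives
    confusion[0, 1] = (test1 & (test2 == False)).sum()             # false positives
    confusion[1, 0] = ((test1 == False) & test2).sum()             # false negatives
    confusion[1, 1] = ((test1 == False) & (test2 == False)).sum()  # true negatives
    return confusion
With the example data from the question this returns [[4, 1], [2, 0]].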
Upvotes: 1
Reputation: 980
You can do this with NumPy. I will ignore the fact that the tests have letters and just use arrays instead:
# assume:
# responses    = [...list of booleans...]  (the results being checked, i.e. Test1)
# ground_truth = [...list of booleans...]  (the correct results, i.e. Test2)
import numpy as np

responses = np.array(responses)
ground_truth = np.array(ground_truth)

true_positives  = np.logical_and(responses, ground_truth)
true_negatives  = np.logical_and(np.logical_not(responses), np.logical_not(ground_truth))
false_positives = np.logical_and(responses, np.logical_not(ground_truth))
false_negatives = np.logical_and(np.logical_not(responses), ground_truth)

num_true_positives  = np.count_nonzero(true_positives)
num_true_negatives  = np.count_nonzero(true_negatives)
num_false_positives = np.count_nonzero(false_positives)
num_false_negatives = np.count_nonzero(false_negatives)

confusion_matrix = np.array([
    [num_true_positives, num_false_positives],
    [num_true_negatives, num_false_negatives]
])
Note that this is not the layout scikit-learn uses (its confusion_matrix returns [[TN, FP], [FN, TP]], with rows as actual classes and columns as predicted classes), but you can rearrange the cells in your own code.
P.S.:
You can also use sklearn: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html
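A minimal sketch of that route, assuming responses holds the results being checked and ground_truth the correct values:
from sklearn.metrics import confusion_matrix

# Rows are actual classes, columns are predicted classes: [[TN, FP], [FN, TP]]
cm = confusion_matrix(ground_truth, responses, labels=[False, True])
tn, fp, fn, tp = cm.ravel()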
Upvotes: 1