asdf123

Reputation: 49

How to compare two different true or false columns and get a confusion matrix? Python

I have two sets of True/False results for the same column. Test 1 contains the results being checked (some of them are wrong), and Test 2 contains the correct results. Is there Python code that can compare the two and produce a confusion matrix (true positives, false positives, false negatives, and true negatives)?

For example:

Test1
a  True
b  True
c  False
d  False
e  True
f  True
g  True



Test2
a  True
b  True
c  True
d  True
e  True
f  True
g  False

Upvotes: 0

Views: 652

Answers (2)

The Photon

Reputation: 1367

Is there python code that can compare these two results and obtain a confusion matrix result (true positives, false positives, false negatives, and true negatives)?

Assuming Test1 and Test2 are Pandas Series objects,

True positives: Test1 & Test2

False positives: Test1 & (Test2 == False)

False negatives: (Test1==False) & Test2

True negatives: (Test1==False) & (Test2==False)

To get the number of True values in a boolean Series, use Series.sum() (Series.count() would count every non-null entry, True or False). For example, the number of true positives would be (Test1 & Test2).sum().

Assuming you want the confusion matrix as a numpy array, you just fill in the cells appropriately:

import numpy as np
confusion = np.zeros((2, 2), dtype=int)
confusion[0, 0] = (Test1 & Test2).sum()  # true positives

and so on...
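
For what it's worth, here is a minimal end-to-end sketch of the approach above using the sample data from the question. It assumes Test1 holds the results being checked and Test2 the correct ones. The lowercase names test1/test2 and the [[TP, FP], [FN, TN]] layout are my own choices for illustration, not a fixed convention.

```python
import numpy as np
import pandas as pd

index = list("abcdefg")
# Sample data from the question; treating test2 as the reference is an assumption.
test1 = pd.Series([True, True, False, False, True, True, True], index=index)
test2 = pd.Series([True, True, True, True, True, True, False], index=index)

# Summing a boolean Series counts its True values.
tp = int((test1 & test2).sum())    # both True
fp = int((test1 & ~test2).sum())   # test1 True, test2 False
fn = int((~test1 & test2).sum())   # test1 False, test2 True
tn = int((~test1 & ~test2).sum())  # both False

# Rows: test1 positive/negative; columns: test2 positive/negative.
confusion = np.array([[tp, fp],
                      [fn, tn]])
print(confusion)
```

For the sample data this gives 4 true positives, 1 false positive, 2 false negatives and 0 true negatives.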

Upvotes: 1

Michael Sohnen

Reputation: 980

You can do this with numpy.

I will ignore the fact that the tests are labeled with letters, and just use arrays instead:

#assume: 
#responses = [...list of booleans...]
#ground_truth = [...list of booleans...]

import numpy as np
responses = np.array(responses)
ground_truth = np.array(ground_truth)

true_positives = np.logical_and(responses,ground_truth)
true_negatives = np.logical_and(np.logical_not(responses),np.logical_not(ground_truth))
false_positives = np.logical_and(responses,np.logical_not(ground_truth))
false_negatives = np.logical_and(np.logical_not(responses),ground_truth)

num_true_positives = np.count_nonzero(true_positives)
num_true_negatives = np.count_nonzero(true_negatives)
num_false_positives = np.count_nonzero(false_positives)
num_false_negatives = np.count_nonzero(false_negatives)

confusion_matrix = np.array([
    [num_true_positives,num_false_positives],
    [num_true_negatives,num_false_negatives]
])

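I'm not sure that's the conventional layout for a confusion matrix (scikit-learn, for instance, puts true labels on rows and predicted labels on columns, giving [[TN, FP], [FN, TP]]), but you can rearrange it in your own code.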

P.S.:

You can also use sklearn: https://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html
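
A rough sketch of the sklearn route, using the sample data from the question. It assumes Test2 is the ground truth and Test1 the responses; casting the booleans to 0/1 and passing labels=[0, 1] are my own choices to make the row/column order explicit.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Sample data from the question (Test2 assumed to be the ground truth).
ground_truth = np.array([True, True, True, True, True, True, False]).astype(int)
responses = np.array([True, True, False, False, True, True, True]).astype(int)

# Rows are true labels, columns are predicted labels, in the order given by `labels`.
cm = confusion_matrix(ground_truth, responses, labels=[0, 1])

# With labels=[0, 1], flattening the 2x2 matrix yields TN, FP, FN, TP.
tn, fp, fn, tp = cm.ravel()
print(cm)
```

Note that sklearn expects the true labels as the first argument and the predictions as the second, so swapping them transposes the matrix.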

Upvotes: 1
