Reputation: 85
I have survey results that I have one-hot encoded, and I would like to calculate the sensitivity of each participant's response.
Below is an example of how my DataFrame is structured:
Question 1 | Chocolate | Pizza | Ice-Cream | None of the Above |
Participant ID | | | | |
1 | 1 | 1 | 1 | 0 |
2 | 0 | 0 | 1 | 0 |
3 | 1 | 0 | 1 | 0 |
I would like to append a column that contains the sum of true positives and another with the sum of false negatives, to then create another with the sensitivity score (for each participant).
Below is an example of what I am trying to do:
Question 1 | Chocolate | ... | True Positive | False Negative | ..
Participant ID | | | | |
1 | 1 | ... | 2 | 0 | ..
2 | 0 | ... | 1 | 1 | ..
3 | 1 | ... | 2 | 1 | ..
I am not sure where to begin with this! Can anyone help me out?
Thanks a lot!
Upvotes: 0
Views: 817
Reputation: 140
You could calculate the true positives, false negatives, etc. by using a confusion matrix (e.g. from scikit-learn). Maybe the following code is useful for you:
import pandas as pd
from sklearn.metrics import confusion_matrix

# Participant answers (one row per participant) and the correct answers
a = [[1,1,1,0], [0,0,1,0], [1,0,1,0]]
correct = [[1,0,1,0], [1,0,1,0], [1,0,1,0]]
df = pd.DataFrame(data=a)
df.columns = ['chocolate', 'pizza', 'icecream', 'none']

for i in range(len(df)):
    pred = a[i]
    true = correct[i]
    # labels=[0, 1] keeps the matrix 2x2 even if a row contains only one class
    tn, fp, fn, tp = confusion_matrix(true, pred, labels=[0, 1]).ravel()
    print(f'Nr:{i} true neg:{tn} false pos:{fp} false neg:{fn} true pos:{tp}')
The output, which you could put in a DataFrame, is:
Nr:0 true neg:1 false pos:1 false neg:0 true pos:2
Nr:1 true neg:2 false pos:0 false neg:1 true pos:1
Nr:2 true neg:2 false pos:0 false neg:0 true pos:2
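If you want the counts as columns rather than printed lines, something like the following might work. It is only a sketch: it assumes every participant is scored against the same answer key (chocolate and ice-cream correct, as in the `correct` list above), and the column names `true_positive`, `false_negative` and `sensitivity` are just illustrative. Sensitivity is taken as TP / (TP + FN).

```python
import pandas as pd

# One-hot encoded answers, indexed by participant ID (same values as above).
df = pd.DataFrame(
    [[1, 1, 1, 0], [0, 0, 1, 0], [1, 0, 1, 0]],
    columns=["chocolate", "pizza", "icecream", "none"],
    index=pd.Index([1, 2, 3], name="participant_id"),
)

# Assumed answer key (hypothetical): 1 marks an option that should be selected.
correct = pd.Series({"chocolate": 1, "pizza": 0, "icecream": 1, "none": 0})

# Columns whose correct answer is 1.
pos_cols = correct.index[correct == 1]

# True positives: correct options the participant selected.
tp = df[pos_cols].sum(axis=1)
# False negatives: correct options the participant missed.
fn = (1 - df[pos_cols]).sum(axis=1)

result = df.assign(
    true_positive=tp,
    false_negative=fn,
    sensitivity=tp / (tp + fn),
)
print(result)
```

With this key, participants 1 and 3 get a sensitivity of 1.0 and participant 2 gets 0.5, which agrees with the confusion-matrix output above. If the key had no correct options at all, TP + FN would be 0 and the sensitivity would come out as NaN, so you may want to handle that case separately.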
Upvotes: 0