Reputation: 1819
I have a data frame with two columns: score1, which is numeric, and truth1, which is boolean. I want to predict truth1 using score1. To do that, I want a simple linear model, and then a good threshold, i.e., one that gives me 75% sensitivity on my ROC curve. So I do:
library(pROC)
roc_curve = roc(truth1 ~ score1, data = my_data)
coords(roc=roc_curve, x = 0.75, input='sensitivity', ret='threshold')
My problem is that coords returns NA, because a sensitivity of exactly 0.75 does not appear on the ROC curve. So here is my question: how can I get the threshold that gives me a sensitivity of at least 0.75, with maximum specificity?
Upvotes: 8
Views: 9025
Reputation: 486
To expand on Calimo's excellent answer, here is a generalizable code snippet:
# Specify SENSITIVITY criteria to meet.
Sn.upper <- 1.0
Sn.lower <- 0.6
# Specify SPECIFICITY criteria to meet.
Sp.upper <- 1.0
Sp.lower <- 0.5
# Extract all coordinate values from the ROC curve.
my.coords <- coords(roc=roc_curve, x = "all", transpose = FALSE)
# Identify and print all points on the ROC curve that meet the JOINT sensitivity AND specificity criteria.
my.coords[my.coords$specificity >= Sp.lower & my.coords$specificity <= Sp.upper &
          my.coords$sensitivity >= Sn.lower & my.coords$sensitivity <= Sn.upper, ]
Example output:
       threshold specificity sensitivity
all.46    10.950   0.5000000   0.7073171
all.47    11.080   0.5138889   0.7073171
all.48    11.345   0.5138889   0.6829268
all.49    11.635   0.5138889   0.6585366
all.50    11.675   0.5138889   0.6341463
all.51    11.700   0.5277778   0.6341463
all.52    11.725   0.5277778   0.6097561
all.53    11.850   0.5416667   0.6097561
all.54    12.095   0.5555556   0.6097561
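If you need this filter more than once, it could be wrapped in a small function. This is only a sketch: filter_coords is a hypothetical helper, not part of pROC, and it assumes a data frame with sensitivity and specificity columns such as the one coords returns with transpose = FALSE. The toy data frame reuses three rows from the example output above.

```r
# Hypothetical helper (not part of pROC): keep the ROC points whose
# sensitivity and specificity both fall inside the given closed ranges.
filter_coords <- function(coords_df, sn = c(0, 1), sp = c(0, 1)) {
  keep <- coords_df$sensitivity >= sn[1] & coords_df$sensitivity <= sn[2] &
          coords_df$specificity >= sp[1] & coords_df$specificity <= sp[2]
  coords_df[keep, , drop = FALSE]
}

# Toy data frame shaped like the coords() output; the three rows are
# copied from the example output above.
toy <- data.frame(
  threshold   = c(10.950, 11.700, 12.095),
  specificity = c(0.5000000, 0.5277778, 0.5555556),
  sensitivity = c(0.7073171, 0.6341463, 0.6097561)
)

# Keep points with sensitivity in [0.65, 1] and specificity in [0.5, 1].
res <- filter_coords(toy, sn = c(0.65, 1), sp = c(0.5, 1))
print(res)
```

With real data you would pass my.coords instead of toy.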
Upvotes: 3
Reputation: 7959
Option 1: filter the results yourself:
my.coords <- coords(roc=roc_curve, x = "all", transpose = FALSE)
my.coords[my.coords$sensitivity >= .75, ]
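The filter above returns every point with a sensitivity of at least 0.75. To get the single threshold with maximum specificity among them, you can take the row with the largest specificity. Here is a minimal runnable sketch; it assumes the aSAH example data set that ships with pROC as a stand-in for my_data, with outcome and s100b playing the roles of truth1 and score1.

```r
library(pROC)

# Assumption: aSAH (bundled with pROC) stands in for my_data.
data(aSAH)
roc_curve <- roc(outcome ~ s100b, data = aSAH)

# All ROC points as a data frame: threshold, specificity, sensitivity.
my.coords <- coords(roc_curve, x = "all", transpose = FALSE)

# Keep the points that reach the target sensitivity ...
candidates <- my.coords[my.coords$sensitivity >= 0.75, ]

# ... and pick the most specific one among them.
best <- candidates[which.max(candidates$specificity), ]
print(best)
```

Note that which.max returns only the first row if several points tie on specificity.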
Option 2: you can trick pROC by requesting a partial AUC between 75% and 100% of sensitivity:
roc_curve = roc(truth1 ~ score1 , data = my_data, partial.auc = c(1, .75), partial.auc.focus="sensitivity")
All of pROC's methods will follow this request and give you results only in this area of interest:
coords(roc=roc_curve, x = "local maximas", ret='threshold', transpose = FALSE)
Upvotes: 15