Reputation:
I am having a hard time adapting the example command for a ROC curve to my own dataset. This is for the pROC package.
This is the example, using the aSAH data:
roc(aSAH$outcome, aSAH$s100b)
roc(outcome ~ s100b, aSAH)
So... aSAH should be replaced with the name of my data set or data subset. Correct?
outcome should be replaced by my outcome variable name. Correct?
s100b should be replaced with my predictor variable name. Correct? What if I do not have a single predictor variable but instead I am trying to determine the ROC of a tree? I did try to replace s100b with the name of my tree but that didn't work either.
Upvotes: 3
Views: 936
Reputation: 717
In the roc command in R, the first argument is the observed response and the second is the score predicted by your model. To draw the ROC curve, it is easiest to call the roc function and store the result in a variable; let's call it analysis. You then extract the sensitivities and 1 - specificities from analysis, because those are the coordinates of the ROC curve. This can be done directly in the plot command:
plot(1-analysis$specificities,analysis$sensitivities,type="l")
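For the tree case in the question, the same rule applies: roc needs one numeric score per observation, not the fitted tree object itself. A minimal sketch, assuming the tree was fitted with rpart (the question does not say which tree package was used) and using an illustrative choice of predictors from aSAH:

```r
library(pROC)   # roc(), auc(), and the aSAH example data
library(rpart)  # assumption: the tree is an rpart classification tree

data(aSAH)

# Fit a small classification tree; the predictors are chosen only for illustration
tree <- rpart(outcome ~ s100b + ndka + age, data = aSAH, method = "class")

# Predicted class probabilities: one column per level of outcome ("Good", "Poor")
prob <- predict(tree, newdata = aSAH, type = "prob")

# Use the probability of the "Poor" class as the score
scores <- prob[, "Poor"]

# Response first, score second; levels are given as c(control, case)
analysis <- roc(response = aSAH$outcome, predictor = scores,
                levels = c("Good", "Poor"), direction = "<")
auc(analysis)
```

Note that these scores are computed on the same data the tree was fitted to, so the resulting curve is likely optimistic; predicting on held-out data gives a fairer picture.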
Please have a look at the picture to see what the result could look like in R. Below the picture is the R code for this curve, which you can adapt to your problem. Note that the data are simulated at the beginning.
rm(list = ls()) # clear work space
##Simulate Data
set.seed(123456)
n <- 10000
q <- 0.8
#Simulate observed response (0/1); first half has P(1) = q, second half P(1) = 0.3
Real <- c(sample(c(0,1), n/2, replace = TRUE, prob = c(1-q,q)),
          sample(c(0,1), n/2, replace = TRUE, prob = c(0.7,0.3)))
#Simulate model scores (predicted probabilities)
p <- c(rep(seq(0.4,0.9, length.out=100), 50),
       rep(seq(0.2,0.6, length.out=100), 50))
p <- data.frame(Real = Real, p = p)
#install (only needed once) and load package
install.packages("pROC")
library(pROC)
#apply roc function
analysis <- roc(response=p$Real, predictor=p$p)
#Plot ROC Curve
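plot(1-analysis$specificities,analysis$sensitivities,type="l",
     ylab="Sensitivity",xlab="1-Specificity",col="black",lwd=2,
     main = "ROC Curve for Simulated Data")
abline(a=0,b=1) #diagonal = random classifier
#opt_t was not defined in the original snippet; assumed here to be the
#1-specificity at the point maximising Youden's J (sensitivity + specificity - 1)
youden <- analysis$sensitivities + analysis$specificities - 1
opt_t <- 1 - analysis$specificities[which.max(youden)]
abline(v = opt_t) #add optimal t to ROC curve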
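As an alternative to extracting the coordinates by hand, pROC can plot the roc object directly. A short sketch on the package's own aSAH data; the levels and direction arguments are set explicitly here only to avoid relying on pROC's automatic detection:

```r
library(pROC)

data(aSAH)

# Response first, predictor second; levels are given as c(control, case)
analysis <- roc(response = aSAH$outcome, predictor = aSAH$s100b,
                levels = c("Good", "Poor"), direction = "<")

# legacy.axes = TRUE labels the x axis as 1 - specificity
plot(analysis, legacy.axes = TRUE, main = "ROC Curve for s100b")

# Area under the curve
auc(analysis)

# Threshold that maximises sensitivity + specificity (Youden's J by default)
coords(analysis, "best", ret = c("threshold", "specificity", "sensitivity"))
```

The return type of coords differs between pROC versions, so it is probably safest to inspect the result before indexing into it.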
Upvotes: 1