Here's a reprex
library(caret)
library(dplyr)

set.seed(88, sample.kind = "Rounding")

# Treat the transmission type (am) as the class label
mtcars <- mtcars %>%
  mutate(am = as.factor(am))

# Hold out about 20% of the rows as a test set, stratified on am
test_index <- createDataPartition(mtcars$am, times = 1, p = 0.2, list = FALSE)
train_cars <- mtcars[-test_index, ]
test_cars <- mtcars[test_index, ]

set.seed(88, sample.kind = "Rounding")

# Naive Bayes, 10-fold cross-validation
cars_nb <- train(am ~ mpg + cyl,
                 data = train_cars, method = "nb",
                 trControl = trainControl(method = "cv", number = 10, savePredictions = "final"))

# Logistic regression, 10-fold cross-validation
cars_glm <- train(am ~ mpg + cyl,
                  data = train_cars, method = "glm",
                  trControl = trainControl(method = "cv", number = 10, savePredictions = "final"))
My question is: how can I draw the ROC curves of both models, with their AUC, on a single plot so that I can compare the two models visually?
Upvotes: 3
I assume that you want to show the ROC curves on the test set. This differs from the question linked in the comment (ROC curve from training data in caret), which uses the training data.
The first thing to do is to extract predictions on the test data (newdata=test_cars), in the form of probabilities (type="prob"):
predictions_nb <- predict(cars_nb, newdata=test_cars, type="prob")
predictions_glm <- predict(cars_glm, newdata=test_cars, type="prob")
This gives us a data.frame with one column of probabilities per class (0 and 1). We only need the probability of class 1:
predictions_nb <- predict(cars_nb, newdata=test_cars, type="prob")[,"1"]
predictions_glm <- predict(cars_glm, newdata=test_cars, type="prob")[,"1"]
Next I'll use the pROC package to create the ROC curves for the test data (disclaimer: I am the author of this package. There are other ways to achieve the result, but this is the one I am most familiar with):
library(pROC)
roc_nb <- roc(test_cars$am, predictions_nb)
roc_glm <- roc(test_cars$am, predictions_glm)
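When comparing models it can help to tell roc explicitly which class is the control and which is the case, and in which direction the scores point, rather than relying on automatic detection. The following is only a sketch on made-up data: truth and prob_1 are hypothetical stand-ins for test_cars$am and a vector of class 1 probabilities, not values from the question.

```r
library(pROC)

set.seed(1)

# Made-up stand-ins (assumptions, not data from the question):
# truth  plays the role of test_cars$am
# prob_1 plays the role of the predicted probability of class 1
truth <- factor(rep(c(0, 1), each = 20))
prob_1 <- c(runif(20, 0, 0.7), runif(20, 0.3, 1))

roc_explicit <- roc(truth, prob_1,
                    levels = c("0", "1"),  # control level first, case level second
                    direction = "<",       # controls are expected to score lower than cases
                    quiet = TRUE)          # suppress the informational messages

auc(roc_explicit)
```

With the direction fixed, a model that ranks the classes the wrong way round gets an AUC below 0.5 instead of being silently flipped, which keeps the comparison between the two models fair.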
Finally, you can plot the curves. To draw two curves on one plot with the pROC package, use the lines function to add the second ROC curve to the existing plot:
plot(roc_nb, col="green")
lines(roc_glm, col="blue")
To make it more readable you can add a legend:
legend("bottomright", col=c("green", "blue"), legend=c("NB", "GLM"), lty=1)
And to show the AUC in the legend as well:
legend_nb <- sprintf("NB (AUC: %.2f)", auc(roc_nb))
legend_glm <- sprintf("GLM (AUC: %.2f)", auc(roc_glm))
legend("bottomright",
       col = c("green", "blue"), lty = 1,
       legend = c(legend_nb, legend_glm))
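To make sense of the number shown in the legend: the AUC can be read as the probability that a randomly chosen observation of class 1 receives a higher score than a randomly chosen observation of class 0. Here is a minimal base R sketch of that interpretation on made-up data. The helper manual_auc is hypothetical and is not part of pROC; it is only meant to illustrate the idea, not to replace auc.

```r
# Made-up toy data (not from the question): true classes and predicted P(class 1)
labels <- factor(c(0, 0, 1, 0, 1, 1, 0, 1))
scores <- c(0.10, 0.35, 0.40, 0.45, 0.60, 0.70, 0.20, 0.90)

# Hypothetical helper, not part of pROC: compare every positive score with
# every negative score; a win counts as 1, a tie as 0.5, a loss as 0.
manual_auc <- function(labels, scores, positive = "1") {
  pos <- scores[labels == positive]
  neg <- scores[labels != positive]
  comparisons <- outer(pos, neg, function(p, n) (p > n) + 0.5 * (p == n))
  mean(comparisons)
}

auc_toy <- manual_auc(labels, scores)
auc_toy  # 15 of the 16 positive/negative pairs are ranked correctly: 0.9375
```

On a test set as small as the one in the question there are only a handful of such pairs, so a small difference in AUC between the two models should be interpreted with caution.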
Upvotes: 7