Reputation: 1106
I'm trying to run a ROC analysis on a multiclass kNN model.
So far I have this code for the kNN model, and it works well.
X_train_new is a dataset with 131 numeric variables (columns) and 7210 observations.
Y_train is the outcome variable, which I have stored as a factor. It's a dataset with only 1 column (activity) and 7210 observations (the factor has 6 possible levels).
library(caret)

# 10-fold cross-validation
ctrl <- trainControl(method = "cv",
                     number = 10)
# kNN with k fixed at 5
model2 <- train(X_train_new,
                Y_train$activity,
                method = "knn",
                tuneGrid = expand.grid(k = 5),
                trControl = ctrl,
                metric = "Accuracy")
X_test_new is a dataset with 131 numeric variables (columns) and 3089 observations.
Y_test is the outcome variable, which I have stored as a factor. It's a dataset with only 1 column and 3089 observations (the factor has 6 possible levels).
I run the predict function
knnPredict_test <- predict(model2 , newdata = X_test_new )
I would like to run a ROC analysis for each class vs. all the others. I'm trying:
a = multiclass.roc ( Y_test$activity, knnPredict_test )
knnPredict_test is a vector with predicted classes:
knnPredict_test <- predict(model2 ,newdata = X_test_new )
> length(knnPredict_test)
[1] 3089
> glimpse(knnPredict_test)
Factor w/ 6 levels "laying","sitting",..: 2 1 5 1 3 2 4 5 3 2 ...
This is the error I'm getting:
Error in roc.default(response, predictor, levels = X, percent = percent, :
Predictor must be numeric or ordered.
Upvotes: 2
Views: 1006
Reputation: 9087
To get the ROC, you need a numeric prediction. However, by default predict will give you the predicted classes, which are a factor. Use type = "prob" to get the class probabilities instead.
Here is a reproducible example which has the same error.
library(caret)
knnFit <- train(
  Species ~ .,
  data = iris,
  method = "knn"
)
predictions_bad <- predict(knnFit)
pROC::multiclass.roc(iris$Species, predictions_bad)
#> Error in roc.default(response, predictor, levels = X, percent = percent, :
#> Predictor must be numeric or ordered.
Using type = "prob" fixes the error.
predictions_good <- predict(knnFit, type = "prob")
pROC::multiclass.roc(iris$Species, predictions_good)
#> Call:
#> multiclass.roc.default(response = iris$Species, predictor = predictions_good)
#>
#> Data: multivariate predictor predictions_good with 3 levels of iris$Species: setosa, versicolor, virginica.
#> Multi-class area under the curve: 0.9981
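Since the question asks for each class vs. all, note that, as far as I know, multiclass.roc reports an AUC averaged over pairwise class comparisons (Hand and Till) rather than one-vs-rest curves. Below is a minimal sketch of a one-vs-rest analysis, assuming iris as a stand-in for your data. The names train_df, test_df, knn_fit, probs and ovr_auc are made up for this example; with your objects you would use model2, X_test_new and Y_test$activity instead.

```r
library(caret)
library(pROC)

set.seed(42)

# Hold out 30% of iris as a test set (stand-in for X_test_new / Y_test)
idx <- createDataPartition(iris$Species, p = 0.7, list = FALSE)
train_df <- iris[idx, ]
test_df  <- iris[-idx, ]

# Same setup as in the question: 10-fold CV, k fixed at 5
knn_fit <- train(
  Species ~ .,
  data = train_df,
  method = "knn",
  tuneGrid = expand.grid(k = 5),
  trControl = trainControl(method = "cv", number = 10)
)

# Class probabilities: a data frame with one column per class
probs <- predict(knn_fit, newdata = test_df, type = "prob")

# One-vs-rest: for each class, build a binary response
# (1 = this class, 0 = any other class) and score it with
# the predicted probability of that class
ovr_auc <- sapply(levels(test_df$Species), function(cls) {
  r <- roc(
    response  = as.integer(test_df$Species == cls),
    predictor = probs[[cls]],
    levels    = c(0, 1),   # 0 = controls (rest), 1 = cases (cls)
    direction = "<",       # higher probability means more likely a case
    quiet     = TRUE
  )
  as.numeric(auc(r))
})

ovr_auc
```

ovr_auc is a named numeric vector with one AUC per class. If you keep the roc objects instead of only the AUC, you can pass each one to plot() to draw the individual curves.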
Upvotes: 2