Reputation: 79
I built a text classifier with randomForest, and to evaluate it I am trying to create a ROC curve with the pROC package.
Here is the code:
ndsi.forest <- randomForest(tf.idf[train.index, ], as.factor(train$Note.Reco[train.index]), ntree = 100)
# predict class labels for the test data
ndsi.pred <- predict(ndsi.forest, newdata = tf.idf[test.index, ], type = "response")
pred <- data.frame(ndsi.pred)
result <- data.frame(id = Data_clean$id[test.index], sentiment = pred[ , ])
##"ROC curve"
multiclass.roc(result$sentiment, test$Note.Reco)
Is there a way to plot the ROC curve with the pROC package?
I tried this code:
roc(test$Note.Reco, result$sentiment, levels = c(1,2,3,4,5,6,7,8,9,10))
But I get this error:
Error in roc.default(test$Note.Reco, result$sentiment, levels = c(1, 2, :
'levels' argument must have length 2
Thank you in advance.
Upvotes: 1
Views: 3017
Reputation: 1560
As far as I understand, you have a multiclass response variable (10 different groups).
The ROC curve is defined for the classification of two groups, so multiclass.roc reduces the problem to a set of two-class comparisons (in pROC these are pairwise comparisons between classes, averaged into a single AUC). The multiclass.roc function doesn't plot the curves for you, but a similar idea, "one group against the rest", lets you draw them yourself:
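1) Consider as many ROC curves as you have groups. That is, the ROC curve for the classification of group 1 against the rest, group 2 against the rest, and so on up to group 10.
You can do that with the roc function. All you need is to redefine the response vector with 1 for the individuals belonging to group i and 0 for everyone else. Note that roc also needs a numeric predictor, such as the predicted probability of group i, rather than a predicted class label. Save each roc object under a different name.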
2) To draw all the curves, call plot on each roc object, passing add = TRUE to every call except the first.
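The two steps above could look roughly like the sketch below. It is only an illustration under some assumptions: the built-in iris data (3 classes) stands in for your tf.idf matrix and 10-level Note.Reco, the class probabilities from predict(..., type = "prob") are used as the numeric predictor instead of the predicted labels, and the names fit, prob, classes and roc.list are made up for this example.

```r
library(randomForest)
library(pROC)

set.seed(42)

# Stand-in data (assumption): iris has 3 classes, the question has 10.
train.idx <- sample(nrow(iris), 100)
train <- iris[train.idx, ]
test  <- iris[-train.idx, ]

fit <- randomForest(Species ~ ., data = train, ntree = 100)

# Class probabilities: one column per class, one row per test observation.
prob <- predict(fit, newdata = test, type = "prob")

classes <- levels(test$Species)

# Step 1: one ROC curve per class, with the response recoded as
# 1 = belongs to the class, 0 = belongs to any other class.
roc.list <- lapply(classes, function(cl) {
  roc(response  = as.integer(test$Species == cl),
      predictor = prob[, cl],
      levels    = c(0, 1),
      direction = "<")
})
names(roc.list) <- classes

# Step 2: draw the first curve, then overlay the others with add = TRUE.
cols <- seq_along(classes)
plot(roc.list[[1]], col = cols[1])
for (i in seq_along(classes)[-1]) {
  plot(roc.list[[i]], col = cols[i], add = TRUE)
}
legend("bottomright", legend = classes, col = cols, lwd = 2)
```

Keeping the roc objects in a named list rather than in separately named variables makes the loop shorter, and auc(roc.list[[i]]) gives the area under each individual curve if you want to report it next to the plot.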
Upvotes: 1