dr.nasri84

Reputation: 79

Plot ROC curve with pROC R

I built a text classifier with randomForest, and to evaluate it I am trying to create a ROC curve with the pROC package.

Here is the code:

ndsi.forest <- randomForest(tf.idf[train.index, ], as.factor(train$Note.Reco[train.index]), ntree = 100)

# predict with test data (type = "response" returns the predicted class)
ndsi.pred <- predict(ndsi.forest, newdata = tf.idf[test.index, ], type = "response")
pred <- data.frame(ndsi.pred)
result <- data.frame(id = Data_clean$id[test.index], sentiment = pred[ , ])

##"ROC curve"
multiclass.roc(result$sentiment, test$Note.Reco)

I was wondering whether there is a way to plot the ROC curve with the pROC package.

I tried this code:

roc(test$Note.Reco, result$sentiment, levels = c(1,2,3,4,5,6,7,8,9,10))

But I get this error:

Error in roc.default(test$Note.Reco, result$sentiment, levels = c(1, 2,  : 
  'levels' argument must have length 2

Thank you in advance.

Upvotes: 1

Views: 3017

Answers (1)

R18

Reputation: 1560

As far as I understand, you have a multiclass response variable (corresponding to 10 different groups).

The ROC curve is defined for the classification of two groups, so multiclass.roc has to reduce the problem to a series of two-group comparisons and summarise them in a single AUC (pROC averages over pairs of groups, following Hand and Till). The multiclass.roc function doesn't draw the curves, but with the same idea, taking "one group against the rest", you can:

1) Consider as many ROC curves as you have groups. That is, the ROC curve for the classification of:

  • Group 1 vs Not Group 1
  • Group 2 vs Not Group 2
  • . . .
  • Group 10 vs Not Group 10

You can do that with the roc function. The only thing you need is to redefine the response vector, with 1 for the individuals belonging to group i and 0 for all the others. Save each roc object under a different name.

2) To draw all the curves, call plot on each roc object, adding add = TRUE to every call except the first: plot(..., add = TRUE).
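
The two steps above could look roughly like the sketch below. It is only an illustration under some assumptions: the data are simulated, there are 3 groups instead of 10 to keep it short, and the names truth, prob and roc.list are made up for the example. It also assumes you score each group with a class probability rather than the predicted label, since roc expects a numeric (or ordered) predictor. With randomForest such a matrix can be obtained from predict(..., type = "prob").

```r
library(pROC)

set.seed(42)

# Hypothetical stand-in data: 3 groups instead of 10, 300 test rows.
classes <- c("1", "2", "3")
truth <- factor(sample(classes, 300, replace = TRUE), levels = classes)

# Simulated class "probabilities": uniform noise plus a bump on the true
# class, then row-normalised. In the real case this matrix would come from
# the fitted model, e.g. predict(ndsi.forest, newdata = ..., type = "prob").
prob <- matrix(runif(300 * 3), ncol = 3, dimnames = list(NULL, classes))
idx <- cbind(seq_along(truth), as.integer(truth))
prob[idx] <- prob[idx] + 1
prob <- prob / rowSums(prob)

# Step 1: one roc object per group (group k coded 1, everything else 0).
roc.list <- lapply(classes, function(k) {
  roc(response = as.integer(truth == k), predictor = prob[, k],
      levels = c(0, 1), direction = "<")
})
names(roc.list) <- classes

# Step 2: draw the first curve, then overlay the others with add = TRUE.
cols <- c("black", "red", "blue")
plot(roc.list[[1]], col = cols[1])
for (i in 2:length(roc.list)) {
  plot(roc.list[[i]], col = cols[i], add = TRUE)
}
legend("bottomright", legend = paste("Group", classes, "vs rest"),
       col = cols, lwd = 2)

# One AUC per group, for reference.
aucs <- sapply(roc.list, function(r) as.numeric(auc(r)))
print(aucs)
```

Storing the roc objects in a list instead of under separate names is just a convenience so the plotting can be done in a loop; for 10 groups you would only need to extend classes and cols.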

Upvotes: 1
