MiksL

Reputation: 53

Weird ROC curve

I am getting odd-looking ROC curves when building a random forest model for a highly imbalanced two-class prediction problem. The original event rate in the sample is ~2%, and I am using case weights to counter the class imbalance.
In this case I have weighted my sample so that the effective event rate is 1 in 4 (25%).
My model is set up in the following way:

forest <- ranger(data = sample[, c('fraud', features)]
                 , num.trees = 350
                 , case.weights = sample$wt      # per-row weights that up-weight the rare class
                 , probability = TRUE            # probability forest: returns class probabilities
                 , importance = 'impurity'
                 , write.forest = TRUE
                 , sample.fraction = 0.5
                 , seed = 98
                 , dependent.variable.name = 'fraud')
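The post does not show how `sample$wt` was built. As a rough sketch only, assuming every non-event row keeps weight 1 and every event row gets one common weight, the weight needed to reach a 25% weighted event rate can be derived from the class counts in the confusion matrix further down. The names `n_neg`, `n_pos`, `target` and `w_pos` are made up for this illustration.

```r
# Class counts taken from the confusion matrix in the question.
n_neg <- 815800 + 11391   # true class 0
n_pos <- 13283 + 5503     # true class 1

# Assumed target: a weighted event rate of 25%.
target <- 0.25

# Solve w * n_pos / (w * n_pos + n_neg) = target for w.
w_pos <- (target / (1 - target)) * n_neg / n_pos

# Check: the weighted event rate implied by w_pos.
weighted_rate <- w_pos * n_pos / (w_pos * n_pos + n_neg)

# With a 0/1 label column, the per-row weights could then be built as:
# wt <- ifelse(sample$fraud == 1, w_pos, 1)
```

Under these assumptions each event row ends up weighted roughly 14.7 times as heavily as a non-event row.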

I am getting pretty good results with this setup, as you can see in the confusion matrix below:

    predicted
true      0      1
   0 815800  11391
   1  13283   5503
True positive rate (recall) - 29%
Positive predictive value (precision) - 33%
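The two percentages can be rechecked directly from the matrix. The sketch below (variable names are illustrative) recomputes the rates for both classes; the 29% and 33% figures match the positive class (5503 / 18786 and 5503 / 16894), while the corresponding negative-class rates are both around 98%.

```r
# Confusion matrix from the question: rows = true class, columns = predicted class.
cm <- matrix(c(815800, 13283, 11391, 5503), nrow = 2,
             dimnames = list(true = c("0", "1"), predicted = c("0", "1")))

# Positive-class rates.
tpr <- cm["1", "1"] / sum(cm["1", ])   # true positive rate (recall)
ppv <- cm["1", "1"] / sum(cm[, "1"])   # positive predictive value (precision)

# Negative-class rates.
tnr <- cm["0", "0"] / sum(cm["0", ])   # true negative rate (specificity)
npv <- cm["0", "0"] / sum(cm[, "0"])   # negative predictive value

round(c(tpr = tpr, ppv = ppv, tnr = tnr, npv = npv), 3)
```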

However, when I draw the ROC curve I get the following plot:

perf <- prediction(forest$predictions[,2], sample$fraud)
pred3 <- performance(perf, "tnr", "fnr")
plot(pred3, main="ROC Curve for Random Forest", col="blue", lwd=2)
abline(a=0,b=1,lwd=2,lty=2,col="gray")

[Image: the resulting plot; not included]

I can't understand why my predictions only start to perform well past 50% of the decision interval. Does anyone have a clue, or previous experience with this?

Upvotes: 1

Views: 826

Answers (1)

RomRom

Reputation: 322

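We normally plot the true positive rate against the false positive rate in a ROC curve, but you have plotted the true negative rate against the false negative rate. Maybe that's why. With ROCR the usual call would be performance(perf, "tpr", "fpr").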
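A minimal sketch of the conventional plot with ROCR follows. It uses synthetic `scores` and `labels` as stand-ins for `forest$predictions[,2]` and `sample$fraud`; the data, class sizes and variable names are assumptions, not the asker's. Since tnr = 1 - fpr and fnr = 1 - tpr, a tnr-vs-fnr plot is the usual ROC curve mirrored across the anti-diagonal: it encloses the same area but has a different shape, which may be why it looks unfamiliar.

```r
library(ROCR)

# Synthetic stand-in data (assumption): 100 events among 5000 rows (2%).
set.seed(98)
labels <- c(rep(1, 100), rep(0, 4900))
# Scores in (0, 1) that tend to be higher for events.
scores <- plogis(rnorm(length(labels), mean = ifelse(labels == 1, 1, -1)))

pred <- prediction(scores, labels)

# Conventional ROC: true positive rate (y) against false positive rate (x).
roc <- performance(pred, measure = "tpr", x.measure = "fpr")
plot(roc, main = "ROC Curve for Random Forest", col = "blue", lwd = 2)
abline(a = 0, b = 1, lwd = 2, lty = 2, col = "gray")

# Area under the ROC curve as a single summary number.
auc <- performance(pred, measure = "auc")@y.values[[1]]
```

With the asker's objects, the same calls would take `forest$predictions[,2]` and `sample$fraud` in place of `scores` and `labels`.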

Upvotes: 1
