staove7

Reputation: 580

Trouble using VarImpPlot in randomForest R

I built a RandomForest model using the following code:

library(randomForest)

set.seed(101)
RFs1 = ERC[sample(nrow(ERC),100000),]
RFs2  <-  RFs1[,-c(1,2,3,228,229,230,232,233,234,235,240)] 
RFs2 <- RFs2[complete.cases(RFs2),] # handling missing values

RFfit <- randomForest(as.factor(RFs2$earlyR)~., data=RFs2[,-231])

VI_F <- importance(RFfit)
varImpPlot(VI_F, type = 2) 

Now, when I try to plot the feature importance, I get the following error:

Error in varImpPlot(VI_F, type = 2) : This function only works for objects of class `randomForest'

I looked for a solution to the problem here (Stack Overflow) and elsewhere on the net, but I couldn't find one.

Any help will be appreciated!

Upvotes: 1

Views: 6638

Answers (1)

Nick Criswell

Reputation: 1743

There are two issues with the code, which I'll explain using mtcars since you did not provide sample data. First, pass importance = TRUE in your call to randomForest so that the permutation-based importance measure is computed alongside the node-impurity one.

mtrf <- randomForest(mpg ~ . , data = mtcars, importance = TRUE)

You can get the importance as a table with

importance(mtrf)

> importance(mtrf)
       %IncMSE IncNodePurity
cyl  11.584480    194.396219
disp 12.560117    230.427777
hp   12.908195    201.095073
drat  5.238172     69.766801
wt   12.449930    233.921376
qsec  3.705991     27.621441
vs    4.221830     27.044382
am    1.982329      9.416001
gear  3.472656     18.282543
carb  6.116177     28.398651
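To see what importance = TRUE changes, here is a minimal sketch comparing the two fits on mtcars. The object names rf_default and rf_full are mine, not from the question, and the column names in the comments are what I would expect for a regression forest.

```r
library(randomForest)

set.seed(101)
# Default fit: only the node-impurity measure is stored
rf_default <- randomForest(mpg ~ ., data = mtcars)
# Fit with importance = TRUE: the permutation measure is stored as well
rf_full <- randomForest(mpg ~ ., data = mtcars, importance = TRUE)

colnames(importance(rf_default))  # expected: "IncNodePurity"
colnames(importance(rf_full))     # expected: "%IncMSE" "IncNodePurity"
```

So a type = 2 plot can work from either fit, while anything based on the permutation measure (type = 1) needs the second one.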

Second, to draw the plot, call varImpPlot on the randomForest object itself (the one fitted with importance = TRUE), not on the matrix returned by importance(). Passing that matrix is what triggers the error you are seeing.

varImpPlot(mtrf)

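Putting both points together, a minimal sketch of the workflow from the question, again on mtcars rather than the ERC data (which I don't have). The names rf_fit and vi are illustrative only.

```r
library(randomForest)

set.seed(101)
rf_fit <- randomForest(mpg ~ ., data = mtcars, importance = TRUE)

vi <- importance(rf_fit)
class(vi)      # a plain matrix, which is why varImpPlot(vi) raises the error
class(rf_fit)  # includes "randomForest"

# Pass the fitted model itself; type = 2 selects the node-impurity measure
varImpPlot(rf_fit, type = 2)
```

In your code that would mean calling varImpPlot(RFfit, type = 2) instead of varImpPlot(VI_F, type = 2).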

I would recommend An Introduction to Statistical Learning: with Applications in R as a good introduction to using the randomForest package in R.

Upvotes: 2
