Yuan.guo

Reputation: 41

How to customize the Importance Plot generated by package "randomForest"

Importance plot:


I want to right-align the y-axis text, and I also want to color the variables according to their variable group. For example, Limonene and Valencane are in one group, and a-Selinene and g-Selinene are in another.

However, I cannot find any code for customizing the plot in the "randomForest" package. Do you have any suggestions for customizing it? Thank you!

Upvotes: 3

Views: 12443

Answers (1)

RLave

Reputation: 8364

Here is a working example to follow:

You need to create the grouping that you want, then build the plot yourself with ggplot and geom_bar.

set.seed(4543)
data(mtcars)

library(randomForest)
mtcars.rf <- randomForest(mpg ~ ., data=mtcars, ntree=1000, keep.forest=FALSE,
                          importance=TRUE)
imp <- varImpPlot(mtcars.rf) # let's save the varImp object

# this part just creates the data.frame for the plot part
imp <- as.data.frame(imp)
imp$varnames <- rownames(imp) # row names to column
rownames(imp) <- NULL
imp$var_categ <- rep(1:2, 5) # placeholder variable groups (alternating 1 and 2)

# this is the plot part, be sure to use reorder with the correct measure name
library(ggplot2) 
ggplot(imp, aes(x=reorder(varnames, IncNodePurity), weight=IncNodePurity, fill=as.factor(var_categ))) + 
  geom_bar() +
  scale_fill_discrete(name="Variable Group") +
  ylab("IncNodePurity") +
  xlab("Variable Name")

You can do the same for the other importance measure; just change the plot part accordingly. Because %IncMSE is not a syntactic R name, wrap it in backticks inside aes() (weight = `%IncMSE`).
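As a rough sketch of that variant, the block below rebuilds the data frame and plots %IncMSE instead of IncNodePurity. Two things here are my own substitutions rather than part of the answer above: importance() is used instead of varImpPlot() to get the importance matrix without drawing the base plot, and geom_col() with a y aesthetic replaces geom_bar() with weight. The name p_mse is only illustrative.

```r
library(randomForest)
library(ggplot2)

set.seed(4543)
data(mtcars)
mtcars.rf <- randomForest(mpg ~ ., data = mtcars, ntree = 1000,
                          keep.forest = FALSE, importance = TRUE)

# importance() returns the importance matrix directly,
# without drawing the base-graphics plot
imp <- as.data.frame(importance(mtcars.rf))
imp$varnames <- rownames(imp)   # row names to column
rownames(imp) <- NULL
imp$var_categ <- rep(1:2, 5)    # placeholder grouping, as in the example above

# `%IncMSE` needs backticks because it is not a syntactic R name
p_mse <- ggplot(imp, aes(x = reorder(varnames, `%IncMSE`), y = `%IncMSE`,
                         fill = as.factor(var_categ))) +
  geom_col() +
  scale_fill_discrete(name = "Variable Group") +
  ylab("%IncMSE") +
  xlab("Variable Name")
print(p_mse)
```

The labels and the fill scale are unchanged; only the column referenced in reorder() and in the y aesthetic differs.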


Update based on the OP's answer:

ggplot(imp, aes(x=reorder(varnames, IncNodePurity), y=IncNodePurity, color=as.factor(var_categ))) + 
  geom_point() +
  geom_segment(aes(x=varnames,xend=varnames,y=0,yend=IncNodePurity)) +
  scale_color_discrete(name="Variable Group") +
  ylab("IncNodePurity") +
  xlab("Variable Name") +
  coord_flip()
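The rep(1:2, 5) grouping in the example is only a placeholder that depends on row order. A minimal sketch of assigning groups by variable name instead, using a named lookup vector, might look like this. The group labels ("engine", "drivetrain", "other") and the names var_groups and imp_groups are hypothetical and chosen purely for illustration; substitute your own variable-to-group mapping.

```r
data(mtcars)

# Hypothetical mapping from variable name to group (illustration only)
var_groups <- c(cyl = "engine", disp = "engine", hp = "engine",
                vs = "engine", carb = "engine",
                drat = "drivetrain", am = "drivetrain", gear = "drivetrain",
                wt = "other", qsec = "other")

# Stand-in for the varnames column of the importance data frame:
# every mtcars predictor except the response mpg
imp_groups <- data.frame(varnames = setdiff(names(mtcars), "mpg"),
                         stringsAsFactors = FALSE)

# Look each variable up by name, so the result does not depend on row order
imp_groups$var_categ <- unname(var_groups[imp_groups$varnames])

# Fail early if a variable has no group assigned
stopifnot(!anyNA(imp_groups$var_categ))
print(imp_groups)
```

With the imp data frame from the example, the same lookup would be imp$var_categ <- unname(var_groups[imp$varnames]); the plotting code then colors by these groups without further changes.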


Upvotes: 8
