Chelsea

Reputation: 335

R: Verbose Decision Tree Postscript

I'm working on a decision tree for a project. I am using R, but we are allowed to use SAS if we want. There is a picture of a decision tree in the book that I think looks wonderful (as wonderful as a decision tree can look, I guess): [image: Verbose Decision Tree]

Is it possible to do something like this in R? I've been using the post() function, and all I have found so far is pretty = 0, which keeps it from abbreviating labels. My tree is still just circles with XXXX/XXXX in each circle.

I have looked at the documentation but can't find anything that would make the plot more detailed. The documentation always says "... other arguments to the postscript function." but doesn't actually list those arguments. I'm unsure whether this decision tree comes from SAS, R, or some other language. The book we are using is Data Mining Statistics for Decision Making; I'm not enjoying it, and I have mostly relied on outside sources to explain the concepts. This is the only thing from the book I haven't been able to figure out on my own.

Upvotes: 1

Views: 500

Answers (1)

Len Greski

Reputation: 10855

One option for a visually attractive tree plot is fancyRpartPlot() from the rattle package. It does not exactly replicate the SAS output, but it is more visually appealing than the default plot.

For example, we'll use caret and rattle to run an rpart model on the iris data set:

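library(caret)
library(rattle)
# fix the seed so the random train/test split is reproducible
set.seed(1234)
# hold out 30% of the rows, stratified by Species
inTrain <- createDataPartition(y = iris$Species,
                               p = 0.7,
                               list = FALSE)
training <- iris[inTrain, ]
testing <- iris[-inTrain, ]
# fit a classification tree with rpart via caret
modFit <- train(Species ~ ., method = "rpart", data = training)
# finalModel is the underlying rpart object
fancyRpartPlot(modFit$finalModel)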

...and the output:

[Image: tree drawn by fancyRpartPlot()]

The subheading below the chart can be removed by passing sub = "" to fancyRpartPlot().
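As a minimal sketch of the sub argument, the snippet below fits a tree with rpart() directly rather than through caret (an assumption made only to keep the example short) and draws it without the subheading. The title text is an arbitrary example.

```r
# Sketch only: fit rpart directly instead of going through caret::train()
library(rpart)
library(rattle)   # fancyRpartPlot() relies on rpart.plot (and, as far as I know, RColorBrewer) being installed

fit <- rpart(Species ~ ., data = iris, method = "class")

# main sets the title; an empty sub suppresses the subheading under the plot
fancyRpartPlot(fit, main = "Iris species", sub = "")
```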

In contrast, the default plot is produced with the following code.

# default plot
plot(modFit$finalModel, uniform = TRUE, margin = 0.3)
text(modFit$finalModel, use.n = TRUE, all = TRUE, cex = 0.9)

...and the output:

[Image: tree drawn by the default plot() and text() methods]

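Another approach is the rpart.plot package, as noted in the comments on the question. It has a large number of configurable options. To print both counts and percentages in each node, use the extra argument: extra = 1 shows the number of observations per class, and adding 100 also shows the percentage of observations that fall in the node.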

# use rpart.plot package
library(rpart.plot)
rpart.plot(modFit$finalModel, extra = 101)

...and the output:

[Image: tree drawn by rpart.plot() with extra = 101]
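Since the question was about PostScript output, here is a rough sketch of how a more detailed rpart.plot tree could be written to a .ps file with the base postscript() device instead of post(). The model is fit with rpart() directly, and the file name and page size are arbitrary choices for illustration, not anything from the book.

```r
# Sketch only: detailed tree written to a PostScript file
library(rpart)
library(rpart.plot)

fit <- rpart(Species ~ ., data = iris, method = "class")

# Hypothetical output location; change to wherever the file should go
out_file <- file.path(tempdir(), "iris_tree.ps")

# Open the PostScript device, draw the tree, then close the device
postscript(out_file, width = 7, height = 7,
           horizontal = FALSE, paper = "special")
rpart.plot(fit,
           type = 4,             # separate left/right split labels, all nodes labelled
           extra = 101,          # per-class counts plus % of observations in the node
           under = TRUE,         # put the extra text under the node box
           fallen.leaves = TRUE) # line the leaves up at the bottom
dev.off()
```

Other values of type and extra change how much is printed in each node, so it is worth trying a few combinations to get close to the look of the book's figure.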

Upvotes: 7
