Reputation: 335
I'm working on a decision tree for a project. I am using R but we are allowed to use SAS if we want. There is this picture of a decision tree in the book that I think looks wonderful (as wonderful as a decision tree can look I guess):
Is it possible to do something like this in R? I've been using the post() function, and all I have found so far is to use pretty = 0 to keep it from abbreviating stuff. Mine is still just circles with XXXX/XXXX in each circle.
I have looked at the documentation but can't find anything that would make it more detailed. The documentation always says "... other arguments to the postscript function." but doesn't actually list the other arguments. I'm unsure if this decision tree is from SAS, R, or some other rando language. The book we are using is Data Mining Statistics for Decision Making; I'm not enjoying this book. I have mostly found outside sources to explain the concepts. This is the only thing I haven't been able to figure out on my own from this book.
Upvotes: 1
Views: 500
Reputation: 10855
One option for a visually attractive tree plot is fancyRpartPlot() from the rattle package. It does not exactly replicate the SAS output, but it is more visually appealing than the default plot.
For example, we'll use caret and rattle to run an rpart model on the iris data set:
library(caret)
library(rattle)
# hold out 30% of the rows for testing
inTrain <- createDataPartition(y = iris$Species,
                               p = 0.7,
                               list = FALSE)
training <- iris[inTrain, ]
testing <- iris[-inTrain, ]
# fit a classification tree via caret's rpart method
modFit <- train(Species ~ ., method = "rpart", data = training)
fancyRpartPlot(modFit$finalModel)
...and the output:
One can remove the subheading below the chart with the sub=" " argument to fancyRpartPlot().
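As a minimal sketch of that sub argument, here is the same idea on a tree fit directly with rpart() rather than through caret. The object name fit and the title text are my own choices for illustration, and this assumes the rattle, rpart.plot, and RColorBrewer packages are installed, since fancyRpartPlot() relies on them.

```r
library(rpart)
library(rattle)

# Fit a classification tree directly with rpart (no caret wrapper).
# `fit` is just an illustrative name.
fit <- rpart(Species ~ ., data = iris, method = "class")

# main sets the title; an empty sub suppresses the default
# subheading that is otherwise printed under the chart.
fancyRpartPlot(fit, main = "Iris species", sub = "")
```

If the subheading still shows up, it is worth checking ?fancyRpartPlot for the installed rattle version, since the arguments have changed a bit between releases.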
In contrast, the default plot is produced with the following code.
# default plot
plot(modFit$finalModel, uniform = TRUE, margin = .3)
text(modFit$finalModel, use.n = TRUE, all = TRUE, cex = .9)
...and the output:
Another approach is the rpart.plot package, as noted in the comments to the OP. It includes a large number of configurable options. To print both counts and percentages, use the extra= argument.
# use rpart.plot package
library(rpart.plot)
rpart.plot::rpart.plot(modFit$finalModel,extra=101)
...and the output:
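To get a feel for what extra= does, here is a small sketch comparing two of its values on a tree fit directly with rpart(). The object name fit is my own, and the choice of 104 and fallen.leaves = TRUE is just one combination to try, not a recommendation from the package.

```r
library(rpart)
library(rpart.plot)

# Fit a classification tree directly with rpart (no caret wrapper).
# `fit` is just an illustrative name.
fit <- rpart(Species ~ ., data = iris, method = "class")

# extra = 101: per-class observation counts in each node,
# plus the percentage of all observations that fall in the node.
rpart.plot(fit, extra = 101)

# extra = 104: per-class probabilities in each node instead of counts,
# again with the percentage of observations; fallen.leaves = TRUE
# lines the leaf nodes up along the bottom of the plot.
rpart.plot(fit, extra = 104, fallen.leaves = TRUE)
```

Adding 100 to any extra value is what appends the percentage, so extra = 1 alone would give just the counts.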
Upvotes: 7