Reputation: 119
I'm fitting a model with the ctree method in caret and trying to plot the resulting decision tree. This is the main part of my code:
fitControl <- trainControl(method = "cv", number = 10)
dtree <- train(
  Outcome ~ ., data = training_set,
  method = "ctree", trControl = fitControl
)
I'm trying to plot the decision tree and I use
plot(dtree$finalModel)
which gives me a plot similar to the first plot in the answer to this question: Plot ctree using rpart.plot functionality
The function as.simpleparty doesn't work either, as dtree$finalModel is not an rpart object.
I want to remove the bar charts at the bottom and simply show a 1 or a 0 on each terminal node, telling me how it is classified. Because dtree$finalModel is a BinaryTree object,
prp(dtree$finalModel)
doesn't work.
Upvotes: 4
Views: 2673
Reputation: 1833
It's possible to get a ctree plot that shows outcome labels instead of the graphs at the bottom, but not by plotting the caret object: fit the tree directly with partykit instead. I've included the caret code below for completeness.
First, set up some data for a reproducible example:
library(caret)
library(partykit)
data("PimaIndiansDiabetes", package = "mlbench")
head(PimaIndiansDiabetes)
  pregnant glucose pressure triceps insulin mass pedigree age diabetes
1        6     148       72      35       0 33.6    0.627  50      pos
2        1      85       66      29       0 26.6    0.351  31      neg
3        8     183       64       0       0 23.3    0.672  32      pos
4        1      89       66      23      94 28.1    0.167  21      neg
5        0     137       40      35     168 43.1    2.288  33      pos
6        5     116       74       0       0 25.6    0.201  30      neg
Now find the optimal ctree tuning parameter (mincriterion) using caret:
fitControl <- trainControl(method = "cv", number = 10)
dtree <- train(
  diabetes ~ ., data = PimaIndiansDiabetes,
  method = "ctree", trControl = fitControl
)
dtree
Conditional Inference Tree
768 samples
8 predictor
2 classes: 'neg', 'pos'
No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 691, 691, 691, 692, 691, 691, ...
Resampling results across tuning parameters:
  mincriterion  Accuracy   Kappa
  0.01          0.7239747  0.3783882
  0.50          0.7447027  0.4230003
  0.99          0.7525632  0.4198104
Accuracy was used to select the optimal model using the largest value.
The final value used for the model was mincriterion = 0.99.
This is not an ideal model, but hey ho and on we go.
Now build and plot a ctree model with ctree from the partykit package, using the optimal mincriterion value found by caret:
ct <- ctree(diabetes ~ ., data = PimaIndiansDiabetes, mincriterion = 0.99)
png("diabetes.ctree.01.png", res=300, height=8, width=14, units="in")
plot(as.simpleparty(ct))
dev.off()
This gives the following figure, without the graphs at the bottom but with the outcome labels ("pos" and "neg") on the terminal nodes. Non-default height and width values are needed to avoid overlapping terminal nodes.
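Rather than copying 0.99 by hand, the selected value can be read from the fitted train object. This is a minimal sketch, assuming the same data and caret setup as above. bestTune is the data frame in which caret stores the chosen tuning values. The partykit:: prefixes are a precaution, because the party package (which, as far as I know, is what caret's ctree method uses internally) exports functions with the same names. The seed value is arbitrary.

```r
library(caret)
library(partykit)
data("PimaIndiansDiabetes", package = "mlbench")

set.seed(1)  # arbitrary seed so the CV folds are reproducible
dtree <- train(
  diabetes ~ ., data = PimaIndiansDiabetes,
  method = "ctree",
  trControl = trainControl(method = "cv", number = 10)
)

# Value of mincriterion that caret selected
best_mc <- dtree$bestTune$mincriterion

# Refit with partykit so the result is a party object that as.simpleparty accepts
ct <- partykit::ctree(
  diabetes ~ ., data = PimaIndiansDiabetes,
  control = partykit::ctree_control(mincriterion = best_mc)
)

plot(as.simpleparty(ct))
```

The refit tree comes from a different implementation than the one caret used, so it may not match dtree$finalModel exactly.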
Note: take care with 0/1 outcome variables when using ctree with caret. If the outcome is stored as integer or numeric 0/1 data, caret with the ctree method defaults to building a regression model. Convert the outcome variable to a factor if classification is required.
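As a rough illustration of that conversion, the sketch below uses a small made-up data frame. The column names Outcome and x1 and the values are hypothetical, chosen only to mirror the question's training_set.

```r
# Hypothetical data with a numeric 0/1 outcome (values are illustrative only)
training_set <- data.frame(
  Outcome = c(0, 1, 1, 0, 1),
  x1 = c(2.3, 4.1, 3.8, 1.9, 4.4)
)

# Recode the outcome as a factor so it is treated as a class label, not a number
training_set$Outcome <- factor(training_set$Outcome, levels = c(0, 1))

str(training_set$Outcome)
```

With the outcome stored as a factor, train() should treat the problem as classification and the terminal nodes should show class labels rather than numeric means.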
Upvotes: 2