AVT

Reputation: 119

Plotting a ctree method decision tree in caret, remove unwanted bargraph underneath

I'm running a ctree method model in caret and trying to plot the decision tree I get. This is the main portion of my code.

fitControl <- trainControl(method = "cv", number = 10)
dtree <- train(
  Outcome ~ ., data = training_set, 
  method = "ctree", trControl = fitControl
)

I'm trying to plot the decision tree and I use

plot(dtree$finalModel)

which gives me this:

[image: decision tree]

The picture isn't clear here, but I get an image similar to the first plot in the answer to this question: Plot ctree using rpart.plot functionality

And the function as.simpleparty doesn't work as it is not an rpart object.

I want to remove the bar graphs underneath and simply get a 1 or a 0 on those nodes telling me how each one is classified. Because dtree$finalModel is a BinaryTree object,

prp(dtree$finalModel)

doesn't work.

Upvotes: 4

Views: 2673

Answers (1)

makeyourownmaker

Reputation: 1833

It's possible to get a ctree plot that shows the outcome labels instead of the graphs at the bottom by fitting the tree directly with partykit rather than plotting the caret model. I've included the caret code below for completeness, though.

First, set up some data for a reproducible example:

library(caret)    
library(partykit)
data("PimaIndiansDiabetes", package = "mlbench")
head(PimaIndiansDiabetes)
      pregnant glucose pressure triceps insulin mass pedigree age diabetes
1        6     148       72      35       0 33.6    0.627  50      pos
2        1      85       66      29       0 26.6    0.351  31      neg
3        8     183       64       0       0 23.3    0.672  32      pos
4        1      89       66      23      94 28.1    0.167  21      neg
5        0     137       40      35     168 43.1    2.288  33      pos
6        5     116       74       0       0 25.6    0.201  30      neg

Now find optimal ctree parameters using caret:

fitControl <- trainControl(method = "cv", number = 10)
dtree <- train(
  diabetes ~ ., data = PimaIndiansDiabetes, 
  method = "ctree", trControl = fitControl
)

dtree
Conditional Inference Tree

768 samples
  8 predictor
  2 classes: 'neg', 'pos'

No pre-processing
Resampling: Cross-Validated (10 fold)
Summary of sample sizes: 691, 691, 691, 692, 691, 691, ...
Resampling results across tuning parameters:

  mincriterion  Accuracy   Kappa
  0.01          0.7239747  0.3783882
  0.50          0.7447027  0.4230003
  0.99          0.7525632  0.4198104

Accuracy was used to select the optimal model using the largest value.
The final value used for the model was mincriterion = 0.99.

This is not an ideal model, but hey ho and on we go.

Now build and plot a ctree model using ctree() from the partykit package, with the optimal mincriterion found by caret:

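# Call partykit::ctree explicitly: party (used by caret) also exports a ctree()
ct <- partykit::ctree(
  diabetes ~ ., data = PimaIndiansDiabetes,
  control = partykit::ctree_control(mincriterion = 0.99)
)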

png("diabetes.ctree.01.png", res=300, height=8, width=14, units="in")
plot(as.simpleparty(ct))
dev.off()

This gives the following figure, which has no graphs at the bottom and shows the outcome labels ("pos" and "neg") on the terminal nodes. Non-default height and width values are needed to keep the terminal nodes from overlapping.

[image: diabetes ctree diagram]
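Rather than hard-coding 0.99, the tuned value can be read from the fitted caret object. The following is a minimal sketch, assuming caret, party, partykit and mlbench are installed; the seed and the name best_mc are arbitrary choices made for illustration. Both party and partykit export a function called ctree, so the partykit calls are namespaced here to avoid picking up the wrong one. The tree refitted by partykit is not guaranteed to be identical to dtree$finalModel.

```r
library(caret)
library(partykit)
data("PimaIndiansDiabetes", package = "mlbench")

set.seed(42)  # arbitrary seed, only so the CV folds are repeatable

# Tune mincriterion with caret (caret fits the tree with party internally)
dtree <- train(
  diabetes ~ ., data = PimaIndiansDiabetes,
  method = "ctree",
  trControl = trainControl(method = "cv", number = 10)
)

# bestTune is a one-row data frame holding the selected tuning parameters
best_mc <- dtree$bestTune$mincriterion

# Refit with partykit so that as.simpleparty() can be applied
ct <- partykit::ctree(
  diabetes ~ ., data = PimaIndiansDiabetes,
  control = partykit::ctree_control(mincriterion = best_mc)
)

# Terminal nodes show the predicted class instead of bar graphs
plot(partykit::as.simpleparty(ct))
```

Wrapping the plot call in png() and dev.off() as above still applies if the terminal nodes overlap at the default device size.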

Note that care should be taken with 0/1 outcome variables when using ctree with caret. If the outcome is stored as integer or numeric 0/1 data, caret with the ctree method defaults to building a regression model. Convert the outcome variable to a factor if classification is required.
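As a rough sketch of that conversion, assuming a data frame with a numeric 0/1 column called Outcome as in the question (the data frame df and its predictors x1 and x2 are made up purely for illustration):

```r
# Hypothetical data with a numeric 0/1 outcome (illustration only)
set.seed(1)
df <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
df$Outcome <- as.integer(df$x1 + rnorm(200) > 0)

# Recode the outcome as a factor so it is treated as a class label;
# the factor levels become "0" and "1"
df$Outcome <- factor(df$Outcome, levels = c(0, 1))
str(df$Outcome)

# Optional: fit with caret, but only if caret and party are installed
if (requireNamespace("caret", quietly = TRUE) &&
    requireNamespace("party", quietly = TRUE)) {
  library(caret)
  fit <- train(
    Outcome ~ ., data = df,
    method = "ctree",
    trControl = trainControl(method = "cv", number = 10)
  )
  # modelType reports whether caret fitted a classifier or a regression
  print(fit$modelType)
}
```

With the outcome stored this way, a tree refitted with partykit and plotted through as.simpleparty should label its terminal nodes with the factor levels "0" and "1".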

Upvotes: 2
