user3120266

Reputation: 425

Labeling issues for rpart in decision tree in R

I am trying to fit a decision tree to a large dataset. The tree runs fine and I receive no errors. However, the labels on the plotted tree are messy and not legible, and I suspect the results are not correct. (FYI, I removed some of the variables from the code below to keep it short; the problem happens whether I use many variables or just a couple.)

For example, the EMPLOY1 split is shown as "=j", but the values of that variable are "unable to work", "retired", etc. Any thoughts on what I am doing wrong with the tree output?

Code:

library(rpart)
fit <- rpart(poorhealth_cat ~
SCNTWRK1+
SCNTLWK1+
SCNTMEAL+
SCNTMONY+
SCNTPAID+
SEX+
SLEPTIM1+
SMOKE100+
SMOKDAY2+
STRENGTH+
TOLDHI2+
USENOW3+
WEIGHT2+
WTCHSALT+
FRT16,
method="class", data=cdc) # grow tree
printcp(fit) # display the results
plotcp(fit) # visualize cross-validation results
summary(fit) # detailed summary of split

# plot unpruned tree
plot(fit,uniform=TRUE, main="Classification Tree for poorhealth_cat")
text(fit, use.n=TRUE, all=TRUE, cex=.8)


Upvotes: 1

Views: 1773

Answers (1)

JohnCoene

Reputation: 2261

I encountered the same issue. I'm still not sure why it happens, but I "fixed" it by plotting the tree with fancyRpartPlot() from the rattle package instead:

#install
install.packages('rattle')
install.packages('rpart.plot')
install.packages('RColorBrewer')

#load
library(rattle)
library(rpart.plot)
library(RColorBrewer)

#plot
fancyRpartPlot(fit)

With this plot the labels come out right.
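A possible explanation for the single-letter labels, offered as a sketch rather than a confirmed diagnosis of the tree above: `text()` on an rpart object abbreviates factor levels by default (`minlength = 1`), so each level is drawn as a letter (a, b, c, ...) following the order of `levels()`. Passing `pretty = 0` (or `minlength = 0`) asks for the full level names instead. The data frame `toy` and its columns below are made up purely for illustration; they are not the `cdc` data from the question.

```r
library(rpart)

# Hypothetical toy data: one factor predictor with four levels
toy <- data.frame(
  employ = factor(rep(c("employed", "retired", "student", "unable to work"),
                      each = 25))
)
# Outcome depends only on the factor, so the tree must split on it
toy$poorhealth <- factor(
  ifelse(toy$employ %in% c("retired", "unable to work"), "yes", "no")
)

fit_toy <- rpart(poorhealth ~ employ, data = toy, method = "class")

# Default labels: factor levels abbreviated to single letters
short_labels <- labels(fit_toy)
# minlength = 0 turns abbreviation off and keeps the full level names
full_labels <- labels(fit_toy, minlength = 0)

print(short_labels)
print(full_labels)

# Same idea when drawing the tree: pretty = 0 requests full level names
plot(fit_toy, uniform = TRUE, margin = 0.1)
text(fit_toy, use.n = TRUE, pretty = 0, cex = 0.8)
```

If that is what is going on, the original tree could be redrawn with `text(fit, use.n=TRUE, all=TRUE, cex=.8, pretty=0)`. To decode a letter such as "j" without replotting, `levels(cdc$EMPLOY1)[10]` should give the tenth level, assuming EMPLOY1 is stored as a factor.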

Upvotes: 1
