The Singularity

Reputation: 11

Classification tree in R gives different results when run multiple times

I have a problem when running a classification tree in R using the function tree() and the following piece of code:

library(tree)
library(ISLR)
attach(Carseats)
High=ifelse(Sales<=8, "No", "Yes")
Carseats=data.frame(Carseats, High)
tree.carseats=tree(High~.-Sales, Carseats)
summary(tree.carseats)

The problem is that when I run all the code together for the first time, I get the same results as the book I am referring to (Introduction to Statistical Learning):

Classification tree:
tree(formula = High ~ . - Sales, data = Carseats)
Variables actually used in tree construction:
[1] "ShelveLoc"   "Price"       "Income"      "CompPrice"   "Population"  "Advertising" "Age"         "US"         
Number of terminal nodes:  27 
Residual mean deviance:  0.4575 = 170.7 / 373 
Misclassification error rate: 0.09 = 36 / 400 

However, when I run the same code again, the tree no longer gives meaningful results:

Classification tree:
tree(formula = High ~ . - Sales, data = Carseats)
Variables actually used in tree construction:
[1] "High.1"
Number of terminal nodes:  2 
Residual mean deviance:  0 = 0 / 398 
Misclassification error rate: 0 = 0 / 400 

Can someone explain to me what is going on?

Thanks.

Upvotes: 1

Views: 165

Answers (1)

Happier

Reputation: 81

It's been a long time, but I still hope my answer can help you and others who have had the same problem.

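I think the problem is the name "Carseats": you assign the new data.frame to the same name as the original dataset. The first run adds a High column to Carseats, so on the second run Carseats already contains High and data.frame() appends the new column as "High.1". That column is an exact copy of the response, so the tree splits on it alone and predicts High perfectly. I changed the name to "Car" (for example) and it worked: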

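library(tree)
library(ISLR)
attach(Carseats)
# factor() makes High a categorical response; since R 4.0, data.frame()
# no longer converts strings to factors, and tree() needs a factor here
High = factor(ifelse(Sales <= 8, "No", "Yes"))
# Store the result under a new name so Carseats itself is never modified
Car = data.frame(Carseats, High)
tree.carseats = tree(High~.-Sales, Car)
summary(tree.carseats)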

Alternatively, you can do it this way:

library(tree)
library(ISLR)
attach(Carseats)
High = factor(ifelse(Sales <= 8, "No", "Yes"))  # tree() needs a factor response
New = cbind(Carseats, High)
tree.carseats = tree(High~.-Sales, New)
summary(tree.carseats)

Here I used cbind() to combine the Carseats dataset and High into a new dataset named "New".
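
For anyone wondering where "High.1" comes from, here is a minimal base-R sketch. The data frame `toy` and its values are made up for illustration and only stand in for Carseats; the sketch assumes the result is stored back under the same name, as in the question.

```r
# Toy stand-in for Carseats (hypothetical values, base R only).
toy <- data.frame(Sales = c(5, 9, 12, 7))
High <- ifelse(toy$Sales <= 8, "No", "Yes")

# First run: the data frame gains a High column.
toy <- data.frame(toy, High)
first_names <- names(toy)

# Second run: High already exists, so data.frame() de-duplicates the
# name of the new column and appends it as High.1.
toy <- data.frame(toy, High)
second_names <- names(toy)

print(first_names)
print(second_names)

# High.1 is an exact copy of High, so a model of High ~ . can predict
# the response perfectly from it.
same_column <- identical(toy$High, toy$High.1)
print(same_column)
```

With the real data the same thing happens to Carseats, which is why the second summary lists "High.1" as the only variable used.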

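As the question shows, the book's code (ISLR) gives the expected result the first time. The issue only appears when the same lines are run again without restoring the original Carseats, for example with rm(Carseats) or a fresh R session.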
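
Another option, sketched here on a hypothetical toy data frame rather than the real Carseats, is to assign the column by name instead of appending it. Assigning to `$High` overwrites an existing column, so the code can be re-run any number of times without creating High.1. The helper name `add_high` is made up for this example.

```r
# Toy stand-in for Carseats (hypothetical values, base R only).
toy <- data.frame(Sales = c(5, 9, 12, 7))

# Hypothetical helper: adds (or overwrites) a factor column High.
add_high <- function(d) {
  d$High <- factor(ifelse(d$Sales <= 8, "No", "Yes"))
  d
}

once  <- add_high(toy)   # first run adds High
twice <- add_high(once)  # second run overwrites it, no High.1

print(names(once))
print(names(twice))
```

On the real data this would be a single line, Carseats$High <- factor(ifelse(Carseats$Sales <= 8, "No", "Yes")), followed by the same tree() call as before.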

Hope this helps! :)

Upvotes: 1
