Reputation: 43
I've been getting the error message in the subject line. I am using the book An Introduction to Statistical Learning with Applications in R and seem to have done everything by the book. I have even copied and pasted the code directly from the book into RStudio, and it still throws the error. My R code is pasted below. Thanks in advance for your help.
library(ISLR)
attach(Smarket)
glm.fit=glm(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume,
family=binomial)
summary(glm.fit)
glm.probs=predict(glm.fit, type = "response")
glm.probs[1:10]
glm.pred=rep("Down", 1250)
glm.pred[glm.probs>.5] = "Up"
glm.pred
table(glm.pred, Direction)
mean(glm.pred==Direction)
train=(Year>2005)
Smarket.2005=Smarket[!train,]
dim(Smarket.2005)
Direction.2005=Direction[!train]
glm.fit = glm(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume, family = binomial, subset = train)
The error is thrown by the last line (the second glm.fit = call).
Error message (again): Error in model.matrix.default(mt, mf, contrasts) : variable 1 has no levels
Upvotes: 4
Views: 19082
Reputation: 9143
Look at the distribution of the Year variable:
> table(Year)
Year
2001 2002 2003 2004 2005
 242  252  252  252  252
No rows have values larger than 2005.
So when you define train:
train=(Year>2005)
that condition evaluates to FALSE for every element of the vector.
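When you specify subset = train in your glm call, you are selecting a subset that has no rows. With no rows left, the response Direction has no observed factor levels, which is what the "variable 1 has no levels" message is reporting.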
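A minimal sketch of the likely fix, assuming the intended split is to train on the years before 2005 and hold out 2005 (that is, Year < 2005 rather than Year > 2005). It refers to columns through Smarket directly instead of attach(), and the stopifnot() guard is an addition for illustration, not part of the original code:

```r
library(ISLR)  # provides the Smarket data set used in the question

# Years in Smarket run from 2001 to 2005, so "Year > 2005" matches no rows.
sum(Smarket$Year > 2005)

# Assumed intended split: train on years before 2005, hold out 2005.
train <- Smarket$Year < 2005
stopifnot(any(train))  # guard: fail early if the training subset is empty

Smarket.2005   <- Smarket[!train, ]
Direction.2005 <- Smarket$Direction[!train]

glm.fit <- glm(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume,
               data = Smarket, family = binomial, subset = train)

# Predicted probabilities for the held-out 2005 rows.
glm.probs <- predict(glm.fit, Smarket.2005, type = "response")
```

Checking sum(train) or table(train) right after defining a logical subset is a cheap way to catch an empty selection before it reaches glm.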
Upvotes: 5