Alex

Reputation: 43

Error in model.matrix.default(mt, mf, contrasts) : variable 1 has no levels

I've been getting the error message in the subject line. I am using the book Introduction to Statistical Learning with Applications in R and seem to have done everything by the book. I have even copied and pasted the code directly from the book into RStudio, and it still throws the error. My R code is pasted below. Thanks in advance for your help.

library(ISLR)
attach(Smarket)

glm.fit=glm(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume,
            family=binomial)
summary(glm.fit)
glm.probs=predict(glm.fit, type = "response")
glm.probs[1:10]
glm.pred=rep("Down", 1250)
glm.pred[glm.probs>.5] = "Up"
glm.pred
table(glm.pred, Direction)
mean(glm.pred==Direction)
train=(Year>2005)
Smarket.2005=Smarket[!train,]
dim(Smarket.2005)
Direction.2005=Direction[!train]
glm.fit = glm(Direction ~ Lag1 + Lag2 + Lag3 + Lag4 + Lag5 + Volume, family = binomial, subset = train)

The error is raised by the last line (the second glm.fit = call, the one with subset = train).

Error message (again): Error in model.matrix.default(mt, mf, contrasts) : variable 1 has no levels

Upvotes: 4

Views: 19082

Answers (1)

davechilders

Reputation: 9143

Look at the distribution of the Year variable:

> table(Year)
Year
2001 2002 2003 2004 2005 
 242  252  252  252  252 

No rows have values larger than 2005.

So when you define train:

train=(Year>2005)

that condition evaluates to FALSE for every element of the vector.
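A quick way to confirm this is to count how many elements of the mask are TRUE. The sketch below does not use the real Smarket data; it assumes a stand-in Year vector rebuilt from the counts in the table above, so it runs without the ISLR package.

```r
# Stand-in for Smarket$Year, rebuilt from the table() counts shown above
# (an assumption for illustration, not the real data set).
Year <- rep(2001:2005, times = c(242, 252, 252, 252, 252))

# The mask from the question.
train <- (Year > 2005)

range(Year)   # smallest and largest year present
sum(train)    # number of rows the mask selects
any(train)    # FALSE means subset = train would keep nothing
```

If sum(train) comes back as 0, any model fitted with subset = train has no data to work with.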

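When you specify subset = train in your glm call, you are selecting a subset that has no rows. With no rows left, the response factor Direction (variable 1 in the model frame) ends up with no levels once unused levels are dropped, which is what the error message is reporting.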
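As far as I can tell, the book's lab defines the training mask as Year < 2005 (so that 2005 is held out), which suggests the > in your code is a typo. Here is a minimal sketch of the failure and the fix. It uses a small made-up data frame called toy (hypothetical, not Smarket) with a single predictor, so the numbers mean nothing; only the behaviour of subset matters.

```r
set.seed(1)

# Made-up data with the same shape of problem: years 2001-2005 only.
toy <- data.frame(
  Year = rep(2001:2005, each = 20),
  Lag1 = rnorm(100)
)
toy$Direction <- factor(ifelse(toy$Lag1 + rnorm(100) > 0, "Up", "Down"))

# Mask from the question: no year exceeds 2005, so nothing is selected.
bad_train <- toy$Year > 2005
res <- tryCatch(
  glm(Direction ~ Lag1, data = toy, family = binomial, subset = bad_train),
  error = function(e) conditionMessage(e)  # capture the message instead of stopping
)
print(res)

# Corrected mask: train on years before 2005, hold out 2005.
train <- toy$Year < 2005
stopifnot(any(train))  # fail early if the mask is empty
fit <- glm(Direction ~ Lag1, data = toy, family = binomial, subset = train)
nobs(fit)  # number of rows actually used in the fit
```

In your own script the equivalent change would be train=(Year<2005); the stopifnot(any(train)) line is just an optional guard so an empty training set is caught before glm is called.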

Upvotes: 5
