Reputation: 55
I have two questions about prediction using GLMNET - specifically about the intercept.
I made a small example of train data creation, GLMNET estimation and prediction on the train data (which I will later change to Test data):
library(glmnet)
# Train data creation
Train <- data.frame('x1'=runif(10), 'x2'=runif(10))
Train$y <- Train$x1-Train$x2+runif(10)
# From Train data frame to x and y matrix
y <- Train$y
x <- as.matrix(Train[,c('x1','x2')])
# Glmnet model
Model_El <- glmnet(x,y)
Cv_El <- cv.glmnet(x,y)
# Prediction
Test_Matrix <- model.matrix(~.-y,data=Train)[,-1]
Test_Matrix_Df <- data.frame(Test_Matrix)
Pred_El <- predict(Model_El,newx=Test_Matrix,s=Cv_El$lambda.min,type='response')
I want to have an intercept in the estimated formula. This code gives an error concerning the dimensions of the Test_Matrix matrix unless I remove the (Intercept) column of the matrix - as in
Test_Matrix <- model.matrix(~.-y,data=Train)[,-1]
My questions are:
1. Is this the right way to get the prediction when I want the prediction formula to include the intercept?
2. If it is the right way: why do I have to remove the intercept column from the matrix?
Thanks in advance.
Upvotes: 1
Views: 4456
Reputation: 1495
The matrix x you fed into the glmnet function doesn't contain an intercept column. You should therefore keep the same format when constructing your test matrix, i.e. just use model.matrix(y ~ . - 1, data = Train).
By default, glmnet fits an intercept (see the intercept parameter of the glmnet function). So when you called glmnet(x, y), you were in effect calling glmnet(x, y, intercept = TRUE). Thus, even though your x matrix didn't have an intercept column, one was fit for you.
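To see this in practice, here is a minimal sketch (not the asker's exact setup: the seed, the sample size of 50 and the value s = 0.01 are arbitrary choices for illustration). It checks that the fitted coefficients contain an "(Intercept)" row although x has no such column, and that predict simply adds that intercept to x times the slopes.

```r
library(glmnet)

set.seed(1)  # arbitrary seed, only for reproducibility
Train <- data.frame(x1 = runif(50), x2 = runif(50))
Train$y <- Train$x1 - Train$x2 + runif(50)

x <- as.matrix(Train[, c('x1', 'x2')])  # no intercept column
y <- Train$y

fit <- glmnet(x, y)         # intercept = TRUE is the default
cf <- coef(fit, s = 0.01)   # s = 0.01 is an illustrative lambda, not a tuned one
rownames(cf)                # "(Intercept)" appears next to x1 and x2

# newx must have the same columns as x, so no intercept column here either
pred <- predict(fit, newx = x, s = 0.01)

# Manual prediction: intercept + x %*% slopes
b <- as.matrix(cf)[, 1]
manual <- as.numeric(b[1] + x %*% b[-1])
all.equal(manual, as.numeric(pred))
```

So the intercept is part of the prediction even though it is not a column of newx, which is why the "(Intercept)" column has to be dropped from the test matrix when the training matrix did not have one.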
Upvotes: 3
Reputation: 73325
If you want to predict with a matrix that contains an intercept column, you have to fit the model with a matrix that contains one too. Your code used the model matrix x <- as.matrix(Train[,c('x1','x2')]), which is intercept-free, so if you provide an intercept column when using predict, you get an error.
You can do the following:
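x <- model.matrix(y ~ ., Train)  ## model matrix with an "(Intercept)" column
## glmnet still fits its own intercept by default, so the constant column should end up with a zero coefficient
Model_El <- glmnet(x, y)  ## y is Train$y, as in the question
Cv_El <- cv.glmnet(x, y)
Test_Matrix <- model.matrix(y ~ ., Train)  ## prediction matrix with the same columns as x
Pred_El <- predict(Model_El, newx = Test_Matrix, s = Cv_El$lambda.min, type = 'response')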
Note, you don't have to do model.matrix(~ . -y). model.matrix will ignore the LHS of the formula, so it is legitimate to use model.matrix(y ~ .).
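As a quick check of that claim, the sketch below uses only base R and compares the two formulas on data shaped like the question's (the seed is an arbitrary choice). It also builds the intercept-free variant, which for purely numeric predictors like these has just the x1 and x2 columns.

```r
set.seed(1)  # arbitrary seed, only for reproducibility
Train <- data.frame(x1 = runif(10), x2 = runif(10))
Train$y <- Train$x1 - Train$x2 + runif(10)

mm_with_lhs <- model.matrix(y ~ ., data = Train)      # response on the LHS
mm_minus_y  <- model.matrix(~ . - y, data = Train)    # y removed from the RHS by hand
mm_no_int   <- model.matrix(y ~ . - 1, data = Train)  # intercept column dropped

colnames(mm_with_lhs)  # "(Intercept)" "x1" "x2"
colnames(mm_no_int)    # "x1" "x2"

# Same columns and same values either way
all.equal(mm_with_lhs, mm_minus_y, check.attributes = FALSE)
```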
Upvotes: 2