Reputation: 503
Is it possible to fit a lasso model with both penalized and unpenalized covariates? That is, I want to estimate Y ~ gamma * X + beta * Z, where X is an n*p matrix of penalized features and Z is an n*q matrix of unpenalized covariates, which may be continuous or factor variables.
Thanks.
Upvotes: 0
Views: 656
Reputation: 366
This is covered in the glmnet vignette, in the section called Penalty Factors. To make sure some variables are not penalized, set their entries in penalty.factor to 0. You just need to create a vector of length ncol(X) + ncol(Z) where the first ncol(X) entries are 1 (or any positive number) and the remaining ncol(Z) entries are 0. For example:
set.seed(1234)
n <- 100 # number of samples
px <- 5 # number of x variables (penalized)
pz <- 5 # number of z variables (unpenalized)
x <- matrix(rnorm(n*px), ncol = px)
z <- matrix(rnorm(n*pz), ncol = pz)
y <- x[,1] + x[,5] + 2*z[,1] + 3*rnorm(n) # generate response
penalty <- c(rep(1, px), rep(0, pz)) # penalty factor
plot(glmnet::glmnet(cbind(x,z), y, penalty.factor = penalty))
Notice in the plot of the solution path that the 5 z coefficients are never 0, because those variables are never penalized.
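Since the question also mentions factor variables: glmnet expects a numeric matrix, so a factor in Z has to be expanded into dummy columns first, and each dummy column then gets its own 0 in the penalty vector. Here is a minimal sketch of one way to do that with model.matrix. The data frame z and its columns age and group are made up for illustration, as is the simulated response.

```r
library(glmnet)

set.seed(1234)
n  <- 100
px <- 5
x  <- matrix(rnorm(n * px), ncol = px)
colnames(x) <- paste0("x", seq_len(px))

# Hypothetical unpenalized covariates: one continuous, one 3-level factor
z <- data.frame(
  age   = rnorm(n),
  group = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)

# Expand the factor into dummy columns; drop the intercept column
# because glmnet fits its own intercept
z_mat <- model.matrix(~ ., data = z)[, -1, drop = FALSE]

y <- x[, 1] + x[, 5] + 2 * z$age + 3 * rnorm(n)

# One entry per column of cbind(x, z_mat): 1 = penalized, 0 = unpenalized
penalty <- c(rep(1, ncol(x)), rep(0, ncol(z_mat)))
fit <- glmnet(cbind(x, z_mat), y, penalty.factor = penalty)

# Coefficients of the unpenalized columns at every lambda on the path
beta_z <- as.matrix(coef(fit))[colnames(z_mat), , drop = FALSE]
all(beta_z != 0)
```

The last line checks numerically what the plot shows visually: the unpenalized columns stay in the model along the whole path. The key point is that the penalty vector must be built from the number of columns after dummy expansion, not from the number of original variables in Z.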
Upvotes: 1