user3387899

Reputation: 611

Variable selection with glmnet

For a university project I have to find a model by means of the glmnet function, which should estimate and select variables at the same time.

Analogous to an example I found on the internet, I have the following R code:

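install.packages("glmnet")  # only needed once
library(glmnet)

n <- sample.size <- 54      # number of observations
npar <- 16                  # number of candidate predictors
x <- matrix(rnorm(n * npar), n, npar)
y <- sample(1:2, n, replace = TRUE)

# Lasso-penalized Poisson regression over a whole path of lambda values
fit_lasso <- glmnet(x, y, family = "poisson")
fit_lasso

# Coefficients and predictions at specific lambda values
coef(fit_lasso, s = c(0.01, 0.1))
predict(fit_lasso, newx = x[1:10, ], s = c(0.01, 0.005))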

I get some output, but I cannot tell which variables this procedure selects.

Can somebody please help me by clarifying where I have to look in the output to find the selected variables?

Many thanks in advance.

Kind regards,

Pieter
Student at the Catholic University of Leuven

Upvotes: 2

Views: 915

Answers (1)

Patrick McCarthy

Reputation: 2538

I think your chief difficulty is that your example doesn't give discernible variable names to start with. As given, your code produces this:

> coef(fit_lasso, s=c(0.01,0.1))
17 x 2 sparse Matrix of class "dgCMatrix"
                       1             2
(Intercept)  0.401355700  0.4418204837
V1           0.056974354  .           
V2          -0.084883137 -0.0005052818
V3           0.020746643  .           
V4           0.038719413  .           
V5           0.029015126  .           
V6          -0.002403163  .           
V7           0.015661047  .           
V8          -0.063540718  .           
V9           .            .           
V10         -0.005408579  .           
V11         -0.038804146  .           
V12          0.070699231  .           
V13          0.028897285  .           
V14          0.032890192  .           
V15          .            .           
V16          0.026287805  .           

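So for lambda=0.01, the selected variables are the ones with a nonzero coefficient in column 1 (a "." entry means the coefficient is exactly zero), and for lambda=0.1 the only terms left are the intercept and V2. You could make the example clearer by assigning column names to x before fitting the model, since glmnet takes the coefficient names from the column names of x: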

colnames(x) <- letters[1:16]

> coef(fit_lasso, s=c(0.01,0.1))
17 x 2 sparse Matrix of class "dgCMatrix"
                       1             2
(Intercept)  0.401355700  0.4418204837
a            0.056974354  .           
b           -0.084883137 -0.0005052818
c            0.020746643  .           
d            0.038719413  .           
e            0.029015126  .           
f           -0.002403163  .           
g            0.015661047  .           
h           -0.063540718  .           
i            .            .           
j           -0.005408579  .           
k           -0.038804146  .           
l            0.070699231  .           
m            0.028897285  .           
n            0.032890192  .           
o            .            .           
p            0.026287805  .           
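
If you would rather get the selected names directly than read them off the printed matrix, something along these lines should work. This is only a sketch: the seed and the names beta and selected are my own additions, and because the data are random your selected set will differ from the output above.

```r
library(glmnet)

set.seed(1)                              # assumption: fixed seed, only for reproducibility
n <- 54
npar <- 16
x <- matrix(rnorm(n * npar), n, npar)
colnames(x) <- letters[1:npar]           # name the columns before fitting
y <- sample(1:2, n, replace = TRUE)

fit_lasso <- glmnet(x, y, family = "poisson")

# Coefficients at a single lambda, converted from sparse to an ordinary matrix
beta <- as.matrix(coef(fit_lasso, s = 0.01))

# Names of the rows with a nonzero coefficient, leaving out the intercept
selected <- setdiff(rownames(beta)[beta[, 1] != 0], "(Intercept)")
selected

# Alternative: column indices of the nonzero coefficients at this lambda
predict(fit_lasso, s = 0.01, type = "nonzero")
```

Rerunning the last steps with a different s shows how the selected set shrinks as lambda grows.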

Upvotes: 2
