Reputation: 611
For a university project I have to fit a model with the glmnet function, which should estimate the coefficients and select variables at the same time.
Following an example I found on the internet, I have the following R code:
install.packages("glmnet")
library(glmnet)
n <- sample.size <- 54
npar <- 16
x <- matrix(rnorm(n * npar), n, npar)
y <- sample(1:2, n, replace=TRUE)
fit_lasso <- glmnet(x,y,family="poisson")
fit_lasso
coef(fit_lasso, s=c(0.01,0.1))
predict(fit_lasso,newx=x[1:10,], s=c(0.01,0.005))
I get some output, but I do not see which variables this procedure selects.
Could somebody please help me by clarifying where in the output I have to look to find the selected variables?
Many thanks in advance.
Kind regards,
Pieter
Student at the Catholic University of Leuven
Upvotes: 2
Views: 915
Reputation: 2538
I think your chief difficulty is that your example doesn't give discernible variable names to start with. As given, your code produces this:
> coef(fit_lasso, s=c(0.01,0.1))
17 x 2 sparse Matrix of class "dgCMatrix"
                       1             2
(Intercept)  0.401355700  0.4418204837
V1           0.056974354  .
V2          -0.084883137 -0.0005052818
V3           0.020746643  .
V4           0.038719413  .
V5           0.029015126  .
V6          -0.002403163  .
V7           0.015661047  .
V8          -0.063540718  .
V9           .            .
V10         -0.005408579  .
V11         -0.038804146  .
V12          0.070699231  .
V13          0.028897285  .
V14          0.032890192  .
V15          .            .
V16          0.026287805  .
So for lambda=0.01 (column 1), the selected variables are those with a nonzero coefficient; a "." marks a coefficient that is exactly zero. For lambda=0.1 (column 2), only the intercept and V2 remain. You could make the example clearer by assigning column names:
colnames(x) <- letters[1:16]
fit_lasso <- glmnet(x, y, family="poisson")  # refit so the coefficient rows pick up the new names
> coef(fit_lasso, s=c(0.01,0.1))
17 x 2 sparse Matrix of class "dgCMatrix"
                       1             2
(Intercept)  0.401355700  0.4418204837
a            0.056974354  .
b           -0.084883137 -0.0005052818
c            0.020746643  .
d            0.038719413  .
e            0.029015126  .
f           -0.002403163  .
g            0.015661047  .
h           -0.063540718  .
i            .            .
j           -0.005408579  .
k           -0.038804146  .
l            0.070699231  .
m            0.028897285  .
n            0.032890192  .
o            .            .
p            0.026287805  .
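If you would rather not read the selected variables off the printed matrix, you can pull them out programmatically. The following is only a minimal sketch built on the example above: `selected_vars` is a hypothetical helper name (not part of glmnet), and the `set.seed(1)` call is an assumption added purely so the run is reproducible, so the numbers will differ from the output shown earlier.

```r
library(glmnet)

set.seed(1)  # assumption: seed added only to make this sketch reproducible
n <- 54
npar <- 16
x <- matrix(rnorm(n * npar), n, npar)
colnames(x) <- letters[1:npar]  # name the columns before fitting
y <- sample(1:2, n, replace = TRUE)

fit_lasso <- glmnet(x, y, family = "poisson")

# Hypothetical helper: names of the predictors whose coefficient is
# nonzero at penalty value s (the intercept is excluded).
selected_vars <- function(fit, s) {
  cf <- as.matrix(coef(fit, s = s))  # dense (npar + 1) x 1 matrix
  keep <- cf[, 1] != 0 & rownames(cf) != "(Intercept)"
  rownames(cf)[keep]
}

sel_small <- selected_vars(fit_lasso, s = 0.01)
sel_large <- selected_vars(fit_lasso, s = 0.1)
print(sel_small)
print(sel_large)
```

A larger `s` means a stronger penalty, so you would typically see fewer names returned for `s = 0.1` than for `s = 0.01`. Which variables count as "selected" therefore always depends on the lambda you pick.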
Upvotes: 2