Reputation: 577
I am using glmnet for feature selection on a multinomial model with cross-validation. All is well, but with just under 400 predictors and 4 levels the output becomes a bit messy:
library(glmnet)
# Simulated data: 350 predictors, 5-level response
X <- matrix(rnorm(350000), nrow = 1000, ncol = 350)
colnames(X) <- sample(LETTERS, 350, TRUE)
Y <- factor(sample(LETTERS[1:5], 1000, TRUE), levels = LETTERS[1:5])
# parallel = TRUE needs a registered backend (e.g. doParallel); otherwise cv.glmnet runs sequentially
out.cvfit <- cv.glmnet(x = X, y = Y, standardize = TRUE, family = "multinomial",
                       parallel = TRUE, type.measure = "class")
So then I get this kind of output:
coef.cv.glmnet(out.cvfit,"lambda.1se")
...
$D
351 x 1 sparse Matrix of class "dgCMatrix"
1
(Intercept) 0.06770556
F .
L .
B .
W .
V .
W .
G .
X .
G .
A .
G .
V .
Q .
T .
...
A somewhat contrived example, since all coefficients are zero (the random data has no structure), but you get the idea.
This gets very cumbersome to look at across multiple levels and to summarise the extracted predictors. So, is there a way to extract only the non-zero predictors from a sparse matrix?
Upvotes: 1
Views: 1950
Reputation: 4980
I've saved the relevant output as a. You can then subset it using []. Note that . in a dgCMatrix is recognized as 0.
a <- coef.cv.glmnet(out.cvfit, "lambda.1se")$D   # sparse coefficient matrix for level D
a[a[, 1] != 0, ]                                 # keep only the non-zero coefficients
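A small aside (my addition, not part of the original answer): subsetting with [] drops the result to a plain named vector by default; if you would rather keep the one-column sparse-matrix form, you can pass drop = FALSE:

a[a[, 1] != 0, , drop = FALSE]   # stays a dgCMatrix with row names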
Data used (a smaller version of your example):
library(glmnet)
set.seed(2018)
# Smaller simulated data: 35 predictors, 5-level response
X <- matrix(rnorm(35000), nrow = 1000, ncol = 35)
colnames(X) <- sample(LETTERS, 35, TRUE)
Y <- factor(sample(LETTERS[1:5], 1000, TRUE), levels = LETTERS[1:5])
out.cvfit <- cv.glmnet(x = X, y = Y, standardize = TRUE, family = "multinomial",
                       parallel = TRUE, type.measure = "class")
# Coefficients for level D at lambda.1se, keeping only the non-zero rows
a <- coef.cv.glmnet(out.cvfit, "lambda.1se")$D
a[a[, 1] != 0, ]
(Intercept) R G R T Q L Z
0.017394446 -0.055170396 -0.006943011 0.006151795 0.017039835 -0.009432169 -0.047730565 0.065618965
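If it helps, here is a sketch (my own extension, not from the original answer) of the same idea applied to every level at once, using coef(), which dispatches to coef.cv.glmnet():

# Named list with one sparse coefficient matrix per response level
cf <- coef(out.cvfit, s = "lambda.1se")
# Keep only the non-zero rows in each level's matrix
lapply(cf, function(m) m[m[, 1] != 0, , drop = FALSE])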
Upvotes: 1