Fredrik Nylén

Reputation: 577

Sparse matrix subsetting with row names

I am using glmnet for feature selection in a multinomial model with cross-validation. All is well, but with just under 400 predictors and 5 levels the output becomes a bit messy:

library(glmnet)

X <- matrix(rnorm(350000), nrow = 1000, ncol = 350)
colnames(X) <- sample(LETTERS, 350, TRUE)
Y <- factor(sample(LETTERS[1:5], 1000, TRUE), levels = LETTERS[1:5])
# parallel = TRUE requires a registered parallel backend (e.g. via doParallel)
out.cvfit <- cv.glmnet(x = X, y = Y, standardize = TRUE, family = "multinomial",
                       parallel = TRUE, type.measure = "class")

So then I get this kind of output:

coef.cv.glmnet(out.cvfit,"lambda.1se")
...
$D
351 x 1 sparse Matrix of class "dgCMatrix"
                     1
(Intercept) 0.06770556
F           .         
L           .         
B           .         
W           .         
V           .         
W           .         
G           .         
X           .         
G           .         
A           .         
G           .         
V           .         
Q           .         
T           .      
...

A somewhat contrived example, since all coefficients are zero (there is no structure in the random data), but you get the idea.

OK, this gets very cumbersome to look at across multiple levels and to summarise the extracted predictors. So, is there a way to extract only the non-zero predictors from a sparse matrix?

Upvotes: 1

Views: 1950

Answers (1)

milan

Reputation: 4980

I've saved the relevant output as a. You can then subset it with []. Note that a . in a dgCMatrix is simply how a stored zero is printed.

a <- coef.cv.glmnet(out.cvfit,"lambda.1se")$D
a[a[,1]!=0,]
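As a self-contained illustration (using only the Matrix package, with made-up coefficient values), the same [] subsetting on any one-column dgCMatrix keeps just the non-zero rows and returns them as a named vector:

```r
library(Matrix)

# A toy one-column sparse matrix, shaped like a glmnet coefficient vector
m <- Matrix(c(0.5, 0, 0, -0.2, 0, 0), ncol = 1, sparse = TRUE,
            dimnames = list(c("(Intercept)", "A", "B", "C", "D", "E"), NULL))

print(m)            # the zero entries print as "."
m[m[, 1] != 0, ]    # named vector with only (Intercept) and C
```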

Data used (smaller version of your example).

set.seed(2018)
X <- matrix(rnorm(35000),nrow=1000,ncol=35)
colnames(X) <- sample(LETTERS,35,TRUE)
Y <- factor(sample(LETTERS[1:5],1000,TRUE),levels =LETTERS[1:5])
out.cvfit <- cv.glmnet(x=X ,y=Y,standardize=TRUE,family="multinomial",parallel = TRUE,type.measure = "class")

a <- coef.cv.glmnet(out.cvfit,"lambda.1se")$D
a[a[,1]!=0,]
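Since the multinomial fit returns one such matrix per outcome level, the same subsetting can be applied to all levels at once. A sketch, assuming out.cvfit from the code above (coef() on a multinomial cv.glmnet fit returns a named list of one-column sparse matrices, one per level):

```r
# Non-zero coefficients for every class, as a named list of named vectors
cf <- coef(out.cvfit, s = "lambda.1se")
nonzero <- lapply(cf, function(a) a[a[, 1] != 0, ])
nonzero
```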

 (Intercept)            R            G            R            T            Q            L            Z 
 0.017394446 -0.055170396 -0.006943011  0.006151795  0.017039835 -0.009432169 -0.047730565  0.065618965 

Upvotes: 1
