Reputation: 1141
I have a question about LASSO. I'm getting crazy because it is something that I can not solve only according to my background. I'm a biologist.
Briefly I run LASSO using the R library "penalized". In particular I used the opt1D function with around 500 simulations on a data.frame (numerical) of around 30 columns that are my biomarkers (gene expression). I want to test and 3000 rows that are people of which around 50 are tumours and all the others are normals.
Unfortunately by using L1 regularization, all and really all coefficients of 500 simulations are 0. If I check L2 matrix of coefficients they are close to 0. Now my point is that I cannot think that all my biomarkers are not able to distinguish between Normals and Tumors.
I don't know if what I have done is all I can to check for the discriminatory potential of my molecules. Is there something else I can do to understand why are they all 0 and also is there something else I can do to verify that really they are not able to stratify my cohort?
Upvotes: 0
Views: 110
Reputation: 86
Did you consider fitting your data without penalization before using regularization? L1 regularization will naturally result in a significant number of zero coefficients.
As a side note I would first run PCA/PCoA and see whether or not your genes separate according to your class variable. This could save you some time and allow you to trim your data set to those genes that show the greatest differences across your class variable. Also if you have relatively little experience with R I would suggest using a linear modeling package such as Limma since it has excellent documentation and many examples that are easy to follow.
Upvotes: 1