user2043236

Reputation: 71

Lasso path [linear_model.lars_path(method='lasso')]

I am confused about the behavior of the lasso path when running linear_model.lars_path(method='lasso') in scikit-learn.

I thought that once a weight (coefficient) becomes active (different from 0), it must remain active during all subsequent steps of the LARS algorithm.

When running the algorithm on my data, I noticed that sometimes a coefficient becomes active and then later goes back to zero (it is removed from the active set). Is this the correct behavior of the LARS algorithm, or could there be a bug in the scikit-learn implementation?
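For reference, here is a minimal sketch of how I spot such a drop in the returned path (using the scikit-learn diabetes data as a stand-in for my data; whether a drop actually occurs depends on the data set):

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import lars_path

# Stand-in data set for illustration.
X, y = load_diabetes(return_X_y=True)

# coefs has one column per step of the path (shape: n_features x n_steps).
alphas, active, coefs = lars_path(X, y, method='lasso')

# A feature that is nonzero at some step and exactly zero at a later step
# has been dropped from the active set.
nonzero = coefs != 0
for j in range(coefs.shape[0]):
    if np.any(nonzero[j, :-1] & ~nonzero[j, 1:]):
        print(f"feature {j} became active and was later removed")
```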

Upvotes: 1

Views: 1245

Answers (1)

user1149913

Reputation: 4523

This is correct behavior for the L1-regularized version of LARS (and L1-regularized regression is generally known as "lasso").

In the L1 (lasso) version, if a step along the LARS path would make the sign of a regression coefficient disagree with the sign of the correlation between the corresponding column of the data matrix and the residual, i.e. sgn(x_i^\top(y - X\beta)) != sgn(\beta_i), then that column/coefficient is removed from the active set. (You can find the original description in the paper "Least Angle Regression" by Efron et al., Annals of Statistics, 2004.)

In contrast, in the plain (unmodified) LARS algorithm, which behaves more like an L0-style forward selection, the active set only grows at each iteration, so a coefficient never returns to zero once it has become active. A quick way to compare the two variants is sketched below.
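A sketch comparing the two variants that scikit-learn's lars_path exposes (the diabetes data is used purely as an illustration; how many drops occur depends on the data):

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import lars_path

X, y = load_diabetes(return_X_y=True)

def n_dropped(coefs):
    """Count features that are active at some step and inactive at a later one."""
    active = coefs != 0
    return int(np.sum((active[:, :-1] & ~active[:, 1:]).any(axis=1)))

# Lasso modification: a coefficient is removed when its sign would disagree
# with the sign of its correlation with the residual.
_, _, coefs_lasso = lars_path(X, y, method='lasso')

# Plain LARS: the active set is monotone, so nothing is ever removed.
_, _, coefs_lar = lars_path(X, y, method='lar')

print("drops along the lasso path:", n_dropped(coefs_lasso))
print("drops along the plain LARS path:", n_dropped(coefs_lar))  # expect 0
```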

Upvotes: 1
