tester9271

Reputation: 41

How does parameter selection work in LassoCV when nothing is supplied?

I am wondering how LassoCV in sklearn chooses the values of alpha (the shrinkage parameter) when none are provided. For example, when you run:

from sklearn.linear_model import LassoCV

reg = LassoCV(cv=5)  # 5-fold cross-validation
reg.fit(X, Y)

I am happy with the results that I am getting; however, I am curious as to how the model chooses the optimal alpha. Is it simply iterating through all alphas in a range with a given tolerance?

I also wanted to ask what happens when you supply values of alpha yourself or use the n_alphas parameter, e.g.:

reg = LassoCV(cv=5, alphas=[.1, .2, .001, ...])
reg = LassoCV(cv=5, n_alphas=100)

How does it determine which one of these alpha values is best? What alphas does it cycle through when providing a number of alphas?

Thank you.

Upvotes: 4

Views: 5824

Answers (1)

desertnaut

Reputation: 60370

How does it determine which one of these alpha values is best?

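It runs cross-validation over every candidate alpha (the supplied values, or the automatically generated grid) and keeps the one with the lowest mean squared error averaged across the folds; the per-fold errors are stored in the mse_path_ attribute and the winner in alpha_. The coefficient of determination R^2 that the docs mention is what the fitted model's score method returns, not the criterion used to pick alpha.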
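As a rough check of how the winner is picked, the sketch below fits LassoCV on an explicit list of alphas and compares the selected alpha_ with the grid entry whose fold-averaged error in mse_path_ is smallest. The data set and the three candidate values are illustrative assumptions, not taken from the question.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

# Illustrative data and candidate alphas (assumptions for this sketch).
X, y = make_regression(noise=4, random_state=0)
candidates = [0.1, 1.0, 10.0]

reg = LassoCV(cv=5, alphas=candidates).fit(X, y)

# mse_path_ has shape (n_alphas, n_folds): one row per entry of alphas_.
mean_mse = reg.mse_path_.mean(axis=1)
best_by_mse = reg.alphas_[np.argmin(mean_mse)]

print("grid used:      ", reg.alphas_)
print("mean CV MSE:    ", mean_mse)
print("argmin of MSE:  ", best_by_mse)
print("selected alpha_:", reg.alpha_)
```

If the last two printed values agree, the selection is consistent with "lowest average validation MSE wins" on this data.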

What alphas does it cycle through when providing a number of alphas?

It's easy to see with a simple example; asking for only n_alphas=5 for simplicity, we get:

from sklearn.linear_model import LassoCV
from sklearn.datasets import make_regression
X, y = make_regression(noise=4, random_state=0)
reg = LassoCV(cv=5, n_alphas=5, random_state=0).fit(X, y)

According to the docs, one of the attributes of the fitted object is:

alphas_ : numpy array, shape (n_alphas,)

The grid of alphas used for fitting

So, here we have:

reg.alphas_
# result:
array([  6.92751635e+01,   1.23190597e+01,   2.19067302e+00,
         3.89562872e-01,   6.92751635e-02])

The exact values are determined indirectly by the parameter eps, which has a default value of 0.001; from the docs:

eps : float, optional

Length of the path. eps=1e-3 means that alpha_min / alpha_max = 1e-3.

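So, essentially it sets a grid of possible alphas such that the ratio of the minimum to the maximum value is equal to eps, here 0.001. The values are spaced evenly on a log scale: each alpha in the output above is the previous one divided by the same factor, about 5.62. Let's verify the ratio in our simple example: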

reg.alphas_[4]/reg.alphas_[0]
# result
0.00099999999999999959

which, for all practical purposes, is indeed equal to 0.001.
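The grid itself can be reproduced by hand. The sketch below assumes the default settings (fit_intercept=True, eps=1e-3, 100 alphas, no sample weights) and the usual Lasso result that the largest useful alpha is max|X_c^T y| / n_samples, where X_c is the column-centered feature matrix; at or above that value every coefficient is zero. The names alpha_max and manual_grid are mine, and the comparison with alphas_ is a check on that assumption rather than a statement of how the library is written.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoCV

X, y = make_regression(noise=4, random_state=0)
reg = LassoCV(cv=5).fit(X, y)  # defaults: eps=1e-3, 100 alphas

eps = 1e-3
n_samples = X.shape[0]
n_grid = len(reg.alphas_)

# Center the features, as a model with an intercept effectively does.
X_centered = X - X.mean(axis=0)

# Assumed formula: smallest alpha at which all Lasso coefficients are zero.
alpha_max = np.abs(X_centered.T @ y).max() / n_samples

# Log-spaced grid from alpha_max down to eps * alpha_max.
manual_grid = np.geomspace(alpha_max, eps * alpha_max, num=n_grid)

print(np.allclose(manual_grid, reg.alphas_))
print(reg.alphas_[-1] / reg.alphas_[0])
```

Note that the grid depends on the data through alpha_max, so the same eps and number of alphas give different values on a different X and y.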

Upvotes: 3
