Reputation: 177
I understand that both the LinearRegression class and the SGDRegressor class from scikit-learn perform linear regression. However, only SGDRegressor uses Gradient Descent as the optimization algorithm.
What is the optimization algorithm used by LinearRegression, and what are the other significant differences between these two classes?
Upvotes: 3
Views: 7201
Reputation: 151
To understand the algorithm used by LinearRegression, keep in mind that in favorable cases there is an analytical solution (a closed-form formula) for the coefficients that minimize the least-squares error:

theta = (X'X)^(-1) X' y        (1)

where X' is the transpose of X.
When X'X is not invertible, the inverse can be replaced by the Moore-Penrose pseudo-inverse, computed via singular value decomposition (SVD). Even when X'X is invertible, the SVD approach is faster and more numerically stable than applying formula (1) directly. This is in fact what scikit-learn does: LinearRegression solves the least-squares problem through scipy.linalg.lstsq, which is SVD-based.
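To make the equivalence concrete, here is a minimal sketch (the synthetic data and coefficient values are made up for illustration) comparing formula (1), the SVD-based pseudo-inverse, and LinearRegression; all three recover the same coefficients:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 3.0 + rng.normal(scale=0.1, size=100)

# Normal equation (1): prepend a column of ones so theta includes the intercept
Xb = np.hstack([np.ones((100, 1)), X])
theta_normal = np.linalg.inv(Xb.T @ Xb) @ Xb.T @ y

# SVD route: the Moore-Penrose pseudo-inverse
theta_pinv = np.linalg.pinv(Xb) @ y

# scikit-learn's LinearRegression
lr = LinearRegression().fit(X, y)

print(theta_normal)             # [intercept, coefficients] from formula (1)
print(theta_pinv)               # same values via the pseudo-inverse
print(lr.intercept_, lr.coef_)  # same values from scikit-learn
```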
Upvotes: 0
Reputation: 2316
LinearRegression always uses least squares as its loss function.
For SGDRegressor you can specify the loss function (e.g. 'squared_error', 'huber', or 'epsilon_insensitive'), and it fits the model with Stochastic Gradient Descent (SGD): the training set is visited one data point at a time, and the parameters are updated along the error gradient.
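As an illustration of that update rule, here is a minimal numpy sketch of one SGD epoch with squared-error loss; the learning rate eta and the synthetic data are assumptions for the example, not scikit-learn's internals:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = X @ np.array([2.0, -1.0]) + rng.normal(scale=0.1, size=200)

w = np.zeros(2)  # weights
b = 0.0          # intercept
eta = 0.01       # learning rate (kept fixed for simplicity)

# One epoch: visit one sample at a time, step against the error gradient
for xi, yi in zip(X, y):
    error = (w @ xi + b) - yi  # prediction error for this single point
    w -= eta * error * xi      # gradient of 0.5 * error**2 w.r.t. w
    b -= eta * error           # gradient w.r.t. the intercept

print(w, b)  # approaches [2.0, -1.0] and 0.0 after enough epochs
```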
In simple words: you can train SGDRegressor on a training dataset that does not fit into RAM. You can also update an SGDRegressor model with a new batch of data (via partial_fit) without retraining on the whole dataset.
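For instance, partial_fit lets you stream the data in batches; a minimal sketch, where the in-memory batches stand in for chunks you would read from disk or a database:

```python
import numpy as np
from sklearn.linear_model import SGDRegressor

rng = np.random.default_rng(0)
model = SGDRegressor(random_state=0)

# Pretend each batch was just loaded from disk
for _ in range(50):
    X_batch = rng.normal(size=(100, 3))
    y_batch = X_batch @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)
    model.partial_fit(X_batch, y_batch)  # updates the model in place

print(model.coef_, model.intercept_)
```

Note that SGD is sensitive to feature scaling, so in practice you would standardize the inputs (e.g. with StandardScaler) before feeding them to SGDRegressor; the sketch skips that because the synthetic features are already standard normal.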
Upvotes: 6