Kristijan

Reputation: 195

Distance between Linear Regression slope and data point

I have fit a LinearRegression() model. What I want to do now is basically calculate the distance between some data points and the regression line.

My data points are two-dimensional points (x, y).

My question is: How can I get the equation of the line from the LinearRegression() model?

Upvotes: 2

Views: 2039

Answers (2)

Vivek Kalyanarangan

Reputation: 9081

After you have fit the model, you can read the coef_ and intercept_ attributes to see the coefficients and the intercept, respectively.
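As a minimal sketch of reading those attributes, assuming a single feature and a single target: the toy data below is made up for illustration (it lies exactly on y = 2x + 1), and the names model, slope and intercept are not from the question.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative data lying exactly on y = 2x + 1 (not from the question)
X = np.array([[0.0], [1.0], [2.0], [3.0]])  # shape (n_samples, 1)
y = np.array([1.0, 3.0, 5.0, 7.0])          # shape (n_samples,)

model = LinearRegression().fit(X, y)

# With one target, coef_ is a 1D array of length n_features
slope = float(model.coef_[0])
# With one target, intercept_ is a scalar
intercept = float(model.intercept_)

# The fitted line is y = slope * x + intercept
print(f"y = {slope:.3f} * x + {intercept:.3f}")
```

With more than one feature, coef_ holds one weight per feature and the "line" becomes a hyperplane, so the slope/intercept reading above only applies to the 2D case in the question.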

But that means writing out the model's formula by hand. My recommendation is that, once you have built your model, you make predictions and score them against the true y values:

from sklearn.metrics import mean_squared_error
mean_squared_error(y_test, y_pred) # y_test are true values, y_pred are the predictions that you get by calling regression.predict()

If the goal is to calculate distances, you can use the sklearn.metrics convenience functions instead of looking for the equation and computing it by hand. The manual way to do that would be:

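import numpy as np
# Prepend a column of ones so the intercept is handled as an extra coefficient
X_aug = np.column_stack((np.ones(X_test.shape[0]), X_test))
# Matrix product, not elementwise *; assumes a single target, so clf.coef_ is 1D
y_pred = X_aug @ np.insert(clf.coef_, 0, clf.intercept_)
sq_err = np.square(y_pred - y_test)
mean_sq_err = np.mean(sq_err)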

Upvotes: 2

cs95

Reputation: 402872

From the documentation, use clf.coef_ for the weight vector(s) and clf.intercept_ for the bias:

coef_ : array, shape (n_features, ) or (n_targets, n_features)
Estimated coefficients for the linear regression problem. If multiple targets are passed during the fit (y 2D), this is a 2D array of shape (n_targets, n_features), while if only one target is passed, this is a 1D array of length n_features.

intercept_ : array
Independent term in the linear model.

Once you have these, see here.
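For the 2D case in the question, one possible way to turn the slope and intercept into a distance is sketched below. It assumes the fitted line is y = m*x + b, with m taken from clf.coef_[0] and b from clf.intercept_. Written as m*x - y + b = 0, the perpendicular distance from a point (x0, y0) is |m*x0 - y0 + b| / sqrt(m^2 + 1), while the vertical distance (the residual) is |y0 - (m*x0 + b)|. The function names are hypothetical helpers, not part of scikit-learn.

```python
import math

def perpendicular_distance(x0, y0, slope, intercept):
    """Shortest (orthogonal) distance from (x0, y0) to y = slope * x + intercept."""
    # Line in general form: slope * x - y + intercept = 0
    return abs(slope * x0 - y0 + intercept) / math.hypot(slope, 1.0)

def vertical_distance(x0, y0, slope, intercept):
    """Absolute residual: distance measured along the y axis only."""
    return abs(y0 - (slope * x0 + intercept))

# Hypothetical usage with a fitted single-feature model named clf:
# d = perpendicular_distance(x0, y0, float(clf.coef_[0]), float(clf.intercept_))
```

Which of the two is the right "distance" depends on the goal: least-squares regression minimizes the vertical one, whereas the perpendicular one is the geometric distance to the line.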

Upvotes: 1
