Adam Demo_Fighter

Reputation: 77

Polynomial regression using sklearn, returning incorrect result

I am trying to do a polynomial regression using Python's sklearn library, but the result I get is very different from the one I get in Excel.

code:

import matplotlib.pyplot as plt
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

def polynomial_regression(x_param, y_param):
    """create a polynomial regression graph"""
    print(x_param)
    print(y_param)
    # convert x_param features to a numpy array
    x_param = np.array(x_param)

    # save a PolynomialFeatures with degree of 3
    poly = PolynomialFeatures(degree=3, include_bias=False)

    # we fit and transform the numpy array x_param
    poly_features = poly.fit_transform(x_param.reshape(-1, 1))

    # create a LinearRegression instance
    poly_reg_model = LinearRegression()

    # we fit our model to our data
    # which means we train our models by introducing poly_features and y_params values
    poly_reg_model.fit(poly_features, y_param)

    # predict the response 'y_predicted' based on the poly_features and the coef it estimated
    y_predicted = poly_reg_model.predict(poly_features)

    # visualising our model
    plt.figure(figsize=(10, 6))
    plt.title(f"Polynomial regression, coef={poly_reg_model.coef_}", size=16)
    plt.scatter(x_param, y_param)
    plt.plot(x_param, y_predicted, c="red")
    plt.show()

result: [image: scatter plot with a jagged, zig-zagging red regression line]

expected result: [image: scatter plot with a smooth polynomial curve]

Now, are the results supposed to look like this? If so, why? If not, what am I doing wrong? Thanks in advance for your help.

Upvotes: 0

Views: 418

Answers (1)

Hanno Reuvers

Reputation: 646

@Adam Demo_Fighter: Let me maybe post a solution to clarify the remark by @Andrey Lukyanenko.

The issue is plt.plot(x_param, y_predicted, c="red"). This plot command connects successive points from x_param and y_predicted with line segments. If the values in x_param are not in increasing order, this creates the zigzag pattern that appears in your plot. The solution is simply to sort the x values (and reorder the y values to match) before carrying out the analysis.

import matplotlib.pyplot as plt
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Original data
xraw = [3, 6, 6, 9, 12, 15, 3, 6, 13, 9, 13, 16]
yraw = [9, 12, 9, 9, 12, 9, 6, 3, 8, 3, 1, 3]
# Ordered data in x:
OrderedID   = np.argsort(xraw)
x = np.array(xraw)[OrderedID]
y = np.array(yraw)[OrderedID]

print(x)
print(y)

def polynomial_regression(x_param, y_param):
    """create a polynomial regression graph"""
    # save a PolynomialFeatures with degree of 3
    poly = PolynomialFeatures(degree=3, include_bias=False)
    print(poly)

    # we fit and transform the numpy array x_param
    poly_features = poly.fit_transform(x_param.reshape(-1, 1))

    # create a LinearRegression instance
    poly_reg_model = LinearRegression()

    # we fit our model to our data
    # which means we train our models by introducing poly_features and y_params values
    poly_reg_model.fit(poly_features, y_param)

    # predict the response 'y_predicted' based on the poly_features and the coef it estimated
    y_predicted = poly_reg_model.predict(poly_features)

    # visualising our model
    plt.figure(figsize=(10, 6))
    plt.title(f"Polynomial regression, coef={poly_reg_model.coef_}", size=16)
    plt.scatter(x_param, y_param)
    plt.plot(x_param, y_predicted, c="red")
    plt.show()

polynomial_regression(x, y)
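As a variant of the sorting fix above, you can leave the observations in their original order and instead evaluate the fitted model on a dense, already-sorted grid just for plotting. Here is a minimal sketch of that idea using the same data as the answer; the names x_grid and y_grid are my own, not from the original code:

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# Same raw data as above; note x is deliberately left unsorted
x = np.array([3, 6, 6, 9, 12, 15, 3, 6, 13, 9, 13, 16])
y = np.array([9, 12, 9, 9, 12, 9, 6, 3, 8, 3, 1, 3])

# Fit the degree-3 polynomial regression exactly as in the answer
poly = PolynomialFeatures(degree=3, include_bias=False)
model = LinearRegression().fit(poly.fit_transform(x.reshape(-1, 1)), y)

# Predict on a dense grid that is sorted by construction, so the
# line segments drawn between consecutive points form a smooth curve
x_grid = np.linspace(x.min(), x.max(), 200)
y_grid = model.predict(poly.transform(x_grid.reshape(-1, 1)))
```

You would then call plt.scatter(x, y) for the data points and plt.plot(x_grid, y_grid, c="red") for the curve; since x_grid is monotonically increasing, no zigzag can appear regardless of the order of the training data.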

PS1: I transformed to np.array outside of the function.

PS2: Nice profile pic =).

Upvotes: 1
