bud fox

Reputation: 335

Use matplotlib to plot scikit-learn linear regression results

How can I plot scikit-learn linear regression results at the end of the program, so that I can compare the real and predicted values for the "testing" data? The code below is close, but I believe it is missing a scaling factor.

input:

import pandas as pd
import numpy as np
import datetime

pd.core.common.is_list_like = pd.api.types.is_list_like # temp fix
import fix_yahoo_finance as yf
from pandas_datareader import data, wb
from datetime import date
from sklearn.linear_model import LinearRegression
from sklearn import preprocessing, cross_validation, svm
import matplotlib.pyplot as plt

df = yf.download('MMM', start = date (2012, 1, 1), end = date (2018, 1, 1) , progress = False)
df_low = df[['Low']] # create a new df with only the low column
forecast_out = int(5) # predicting some days into future
df_low['low_prediction'] = df_low[['Low']].shift(-forecast_out) # create a new column based on the existing col but shifted some days

X_low = np.array(df_low.drop(['low_prediction'], 1))
X_low = preprocessing.scale(X_low) # scaling the input values

X_low_forecast = X_low[-forecast_out:] # set X_forecast equal to last 5 days
X_low = X_low[:-forecast_out] # remove last 5 days from X

y_low = np.array(df_low['low_prediction'])
y_low = y_low[:-forecast_out]

X_low_train, X_low_test, y_low_train, y_low_test = cross_validation.train_test_split(X_low, y_low, test_size = 0.2)

clf_low = LinearRegression() # classifier
clf_low.fit(X_low_train, y_low_train) # training

confidence_low = clf_low.score(X_low_test, y_low_test) # testing

print("confidence for lows: ", confidence_low)
forecast_prediction_low = clf_low.predict(X_low_forecast)
print(forecast_prediction_low)

plt.figure(figsize = (17,9))
plt.grid(True)
plt.plot(X_low_test, color = "red")
plt.plot(y_low_test, color = "green")
plt.show()


Upvotes: 2

Views: 1335

Answers (1)

Nick

Reputation: 173

You are plotting y_low_test and X_low_test, but X_low_test is the scaled input, not a prediction. If you want to compare target and predicted values, plot y_low_test and clf_low.predict(X_low_test) instead.
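A minimal sketch of that fix, under some assumptions: it uses a synthetic random-walk series in place of the downloaded MMM prices (so it runs offline), the shorter names X, y and model are illustrative, and train_test_split is imported from sklearn.model_selection because sklearn.cross_validation was removed in later scikit-learn releases. The point it shows is that the predictions come out in the same units as the targets, so no extra scaling factor is needed.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import scale

# Assumption: synthetic stand-in for the 'Low' column, not real MMM data.
rng = np.random.default_rng(0)
low = 100 + np.cumsum(rng.normal(0, 1, 500))

forecast_out = 5
X_all = scale(low.reshape(-1, 1))   # standardized input: mean 0, std 1
X = X_all[:-forecast_out]           # drop the last 5 rows (no target for them)
y = low[forecast_out:]              # target: the low 5 days later, in price units

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

model = LinearRegression().fit(X_train, y_train)
y_pred = model.predict(X_test)      # same units as y_test

fig, ax = plt.subplots(figsize=(17, 9))
ax.grid(True)
ax.plot(y_test, color="green", label="actual")
ax.plot(y_pred, color="red", label="predicted")
ax.legend()
# plt.show()  # uncomment to display the figure interactively
```

Because train_test_split shuffles the rows, the x-axis here is just the position in the test set, not time. Passing shuffle=False would keep the test rows in chronological order if that matters for the plot.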

BTW, clf_low in your code is not a classifier; it is a regressor. A neutral name such as model would be clearer than clf.
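As a small illustration of the regressor point (a sketch on made-up data, with illustrative variable names): scikit-learn's is_regressor and is_classifier helpers report which kind of estimator LinearRegression is, and its score method returns the R² coefficient of determination on the given data, so the value printed as "confidence" in the question is an R² score.

```python
import numpy as np
from sklearn.base import is_classifier, is_regressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import r2_score

# Assumption: synthetic linear data with noise, for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 1))
y = 3.0 * X[:, 0] + rng.normal(scale=0.5, size=200)

# Fit on the first 150 rows, evaluate on the remaining 50.
model = LinearRegression().fit(X[:150], y[:150])

print(is_regressor(model), is_classifier(model))

# score() on a regressor is R^2, the same value r2_score computes.
r2_from_score = model.score(X[150:], y[150:])
r2_from_metric = r2_score(y[150:], model.predict(X[150:]))
print(r2_from_score, r2_from_metric)
```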

Upvotes: 2
