Reputation: 121
I extended the example from this link to predict five values. Now I want to add the five new values (New_Interest_Rate and New_Unemployment_Rate) to the original data so I can plot them in a new figure together with the original time series.
import pandas as pd
from sklearn import linear_model
import statsmodels.api as sm
Stock_Market = {'Year': [2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2017,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016,2016],
'Month': [12, 11,10,9,8,7,6,5,4,3,2,1,12,11,10,9,8,7,6,5,4,3,2,1],
'Interest_Rate': [2.75,2.5,2.5,2.5,2.5,2.5,2.5,2.25,2.25,2.25,2,2,2,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75],
'Unemployment_Rate': [5.3,5.3,5.3,5.3,5.4,5.6,5.5,5.5,5.5,5.6,5.7,5.9,6,5.9,5.8,6.1,6.2,6.1,6.1,6.1,5.9,6.2,6.2,6.1],
'Stock_Index_Price': [1464,1394,1357,1293,1256,1254,1234,1195,1159,1167,1130,1075,1047,965,943,958,971,949,884,866,876,822,704,719]
}
df = pd.DataFrame(Stock_Market,columns=['Year','Month','Interest_Rate','Unemployment_Rate','Stock_Index_Price'])
X = df[['Interest_Rate','Unemployment_Rate']] # two variables for multiple regression. For simple linear regression with one variable, use X = df[['Interest_Rate']] instead. You can also add more variables inside the brackets.
Y = df['Stock_Index_Price']
# with sklearn
regr = linear_model.LinearRegression()
regr.fit(X, Y)
print('Intercept: \n', regr.intercept_)
print('Coefficients: \n', regr.coef_)
# prediction with sklearn
New_Interest_Rate = [2.75, 3, 4, 1, 2]
New_Unemployment_Rate = [5.3, 4, 3, 2, 1]
for i in range(len(New_Interest_Rate)):
    print(str(i+1) + ' - Predicted Stock Index Price: \n',
          regr.predict([[New_Interest_Rate[i], New_Unemployment_Rate[i]]]))
# with statsmodels
X = sm.add_constant(X) # adding a constant
model = sm.OLS(Y, X).fit()
predictions = model.predict(X)
print_model = model.summary()
print(print_model)
I cannot figure out how to append the new values, because when I try, I get an error:
Interest_Rate=Interest_Rate.append(New_Interest_Rate)
TypeError: cannot concatenate object of type "<class 'float'>"; only pd.Series, pd.DataFrame, and pd.Panel (deprecated) objs are valid
My goal is to plot the extended predicted values. I use Jupyter Notebook. The original code comes from this link. Thank you!
Upvotes: 1
Views: 1234
Reputation: 311
The code you provided runs on my computer, but with some warning messages. The versions I'm using are Python 3.9.7, pandas 1.3.3-1, sklearn-pandas 2.2.0-1, and statsmodels 0.13.0. I saved it to a file and ran it in a terminal with "python copypastedcode.py". I got this output:
Intercept:
1798.4039776258544
Coefficients:
[ 345.54008701 -250.14657137]
/usr/lib/python3.9/site-packages/sklearn/base.py:441: UserWarning: X does not have valid feature names, but LinearRegression was fitted with feature names
warnings.warn(
1 - Predicted Stock Index Price:
[1422.86238865]
/usr/lib/python3.9/site-packages/sklearn/base.py:441: UserWarning: X does not have valid feature names, but LinearRegression was fitted with feature names
warnings.warn(
2 - Predicted Stock Index Price:
[1834.43795318]
/usr/lib/python3.9/site-packages/sklearn/base.py:441: UserWarning: X does not have valid feature names, but LinearRegression was fitted with feature names
warnings.warn(
3 - Predicted Stock Index Price:
[2430.12461156]
/usr/lib/python3.9/site-packages/sklearn/base.py:441: UserWarning: X does not have valid feature names, but LinearRegression was fitted with feature names
warnings.warn(
4 - Predicted Stock Index Price:
[1643.6509219]
/usr/lib/python3.9/site-packages/sklearn/base.py:441: UserWarning: X does not have valid feature names, but LinearRegression was fitted with feature names
warnings.warn(
5 - Predicted Stock Index Price:
[2239.33758028]
OLS Regression Results
==============================================================================
Dep. Variable: Stock_Index_Price R-squared: 0.898
Model: OLS Adj. R-squared: 0.888
Method: Least Squares F-statistic: 92.07
Date: Wed, 20 Oct 2021 Prob (F-statistic): 4.04e-11
Time: 09:07:19 Log-Likelihood: -134.61
No. Observations: 24 AIC: 275.2
Df Residuals: 21 BIC: 278.8
Df Model: 2
Covariance Type: nonrobust
=====================================================================================
coef std err t P>|t| [0.025 0.975]
-------------------------------------------------------------------------------------
const 1798.4040 899.248 2.000 0.059 -71.685 3668.493
Interest_Rate 345.5401 111.367 3.103 0.005 113.940 577.140
Unemployment_Rate -250.1466 117.950 -2.121 0.046 -495.437 -4.856
==============================================================================
Omnibus: 2.691 Durbin-Watson: 0.530
Prob(Omnibus): 0.260 Jarque-Bera (JB): 1.551
Skew: -0.612 Prob(JB): 0.461
Kurtosis: 3.226 Cond. No. 394.
==============================================================================
Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
The "X does not have valid feature names..." warnings can be fixed by changing
regr.fit(X,Y)
to
regr.fit(X.values, Y.values)
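Another way to avoid the warning, sketched here on the assumption that you want to keep fitting on the DataFrame as in the question, is to pass the new inputs as a DataFrame with the same column names. This also predicts all five rows in a single call. The name new_X is introduced here only for illustration.

```python
import pandas as pd
from sklearn import linear_model

# Same data as in the question
df = pd.DataFrame({
    'Interest_Rate': [2.75,2.5,2.5,2.5,2.5,2.5,2.5,2.25,2.25,2.25,2,2,2,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75],
    'Unemployment_Rate': [5.3,5.3,5.3,5.3,5.4,5.6,5.5,5.5,5.5,5.6,5.7,5.9,6,5.9,5.8,6.1,6.2,6.1,6.1,6.1,5.9,6.2,6.2,6.1],
    'Stock_Index_Price': [1464,1394,1357,1293,1256,1254,1234,1195,1159,1167,1130,1075,1047,965,943,958,971,949,884,866,876,822,704,719],
})
X = df[['Interest_Rate', 'Unemployment_Rate']]
Y = df['Stock_Index_Price']

regr = linear_model.LinearRegression()
regr.fit(X, Y)  # fitted on a DataFrame, so the feature names are stored

# new_X (illustrative name) uses the same column names and order as X
new_X = pd.DataFrame({
    'Interest_Rate': [2.75, 3, 4, 1, 2],
    'Unemployment_Rate': [5.3, 4, 3, 2, 1],
})

# One call returns an array with one prediction per row of new_X
predicted = regr.predict(new_X)
print(predicted)
```

The values should match the five predictions in the output above, since the data and the model are the same.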
If you want to use New_Interest_Rate and New_Unemployment_Rate to fit the regression, then Y needs 5 more corresponding stock prices. I don't think that's what you want if you're trying to predict stock prices from interest and unemployment rates. Here's how you would do that though:
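New_Interest_Rate = [2.75, 3, 4, 1, 2]
New_Unemployment_Rate = [5.3, 4, 3, 2, 1]
New_Stock_Prices = [1,2,3,4,5] # placeholder values, not real data
X_new = pd.DataFrame(data={'Interest_Rate': New_Interest_Rate,'Unemployment_Rate': New_Unemployment_Rate})
Y_new = pd.Series(New_Stock_Prices, name='Stock_Index_Price')
regr = linear_model.LinearRegression()
# Rebuild X and Y from df: by this point X also holds the 'const' column from sm.add_constant.
# pd.concat is used because DataFrame.append was removed in pandas 2.0.
X = pd.concat([df[['Interest_Rate','Unemployment_Rate']], X_new], ignore_index=True)
Y = pd.concat([df['Stock_Index_Price'], Y_new], ignore_index=True)
regr.fit(X.values, Y.values)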
And if you want to make plots, you can make a small function to get stock predictions from input arrays with something like this:
import matplotlib.pyplot as plt

def predict_stock_price(future_interest_rate, future_unemployment_rate):
    # ravel() makes this work whether predict returns a 1-D or a 2-D array
    return [regr.predict([[i, j]]).ravel()[0] for i, j in zip(future_interest_rate, future_unemployment_rate)]

prices = predict_stock_price(New_Interest_Rate, New_Unemployment_Rate)
print("list of predicted stock prices:", prices)
predicted_stock_market = {'Month': range(13, 13+len(prices)), # just to have a time axis to plot with
                          'Interest_Rate': New_Interest_Rate,
                          'Unemployment_Rate': New_Unemployment_Rate,
                          'Stock_Index_Price': prices}
predicted_df = pd.DataFrame(predicted_stock_market)
predicted_df.plot(x="Month", y="Stock_Index_Price", kind='scatter')
plt.show()
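To get the original series and the predictions into one figure, something like the following might work. The TypeError in the question comes from passing a plain list to Series.append, which expects a Series (or a list of Series). Building a second DataFrame and joining the two with pd.concat avoids that. This is only a sketch: the Period and Source columns and the names future and combined are made up here, and the time axis simply continues counting months after the last observed one.

```python
import pandas as pd
import matplotlib.pyplot as plt
from sklearn import linear_model

# Same data as in the question
df = pd.DataFrame({
    'Year': [2017]*12 + [2016]*12,
    'Month': [12,11,10,9,8,7,6,5,4,3,2,1,12,11,10,9,8,7,6,5,4,3,2,1],
    'Interest_Rate': [2.75,2.5,2.5,2.5,2.5,2.5,2.5,2.25,2.25,2.25,2,2,2,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75,1.75],
    'Unemployment_Rate': [5.3,5.3,5.3,5.3,5.4,5.6,5.5,5.5,5.5,5.6,5.7,5.9,6,5.9,5.8,6.1,6.2,6.1,6.1,6.1,5.9,6.2,6.2,6.1],
    'Stock_Index_Price': [1464,1394,1357,1293,1256,1254,1234,1195,1159,1167,1130,1075,1047,965,943,958,971,949,884,866,876,822,704,719],
})

# The data is listed newest first; sort it so time runs left to right
df = df.sort_values(['Year', 'Month']).reset_index(drop=True)
df['Period'] = list(range(1, len(df) + 1))  # 1..24, a simple time axis (assumption)
df['Source'] = 'observed'

features = ['Interest_Rate', 'Unemployment_Rate']
regr = linear_model.LinearRegression()
regr.fit(df[features], df['Stock_Index_Price'])

# New inputs from the question, with the same column names as the training data
future = pd.DataFrame({
    'Interest_Rate': [2.75, 3, 4, 1, 2],
    'Unemployment_Rate': [5.3, 4, 3, 2, 1],
})
future['Stock_Index_Price'] = regr.predict(future[features])
future['Period'] = list(range(len(df) + 1, len(df) + 1 + len(future)))  # 25..29
future['Source'] = 'predicted'

# pd.concat joins the two frames; Year and Month stay empty for the predicted rows
combined = pd.concat([df, future], ignore_index=True)

# One line per source so observed and predicted values are easy to tell apart
fig, ax = plt.subplots()
for source, group in combined.groupby('Source'):
    ax.plot(group['Period'], group['Stock_Index_Price'], marker='o', label=source)
ax.set_xlabel('Period')
ax.set_ylabel('Stock_Index_Price')
ax.legend()
```

In a Jupyter notebook the figure shows up when the cell finishes; in a plain script you would add plt.show() at the end. The same combined DataFrame can be used to plot Interest_Rate or Unemployment_Rate against Period.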
Upvotes: 3