micah

Reputation: 8116

LinearRegression predict - ValueError: matrices are not aligned

I've been searching Google and can't figure out what I'm doing wrong. I'm pretty new to Python and am trying to use scikit-learn on stock data, but I get the error "ValueError: matrices are not aligned" when I call predict.

import datetime

import numpy as np
import pylab as pl
from matplotlib import finance
from matplotlib.collections import LineCollection

from sklearn import cluster, covariance, manifold, linear_model

from sklearn import datasets, linear_model

###############################################################################
# Retrieve the data from Internet

# Choose a reasonably calm time period (not too long ago so that we get
# high-tech firms, and before the 2008 crash)
d1 = datetime.datetime(2003, 01, 01)
d2 = datetime.datetime(2008, 01, 01)

# kraft symbol has now changed from KFT to MDLZ in yahoo
symbol_dict = {
    'AMZN': 'Amazon'}

symbols, names = np.array(symbol_dict.items()).T

quotes = [finance.quotes_historical_yahoo(symbol, d1, d2, asobject=True)
          for symbol in symbols]

open = np.array([q.open for q in quotes]).astype(np.float)
close = np.array([q.close for q in quotes]).astype(np.float)

# The daily variations of the quotes are what carry most information
variation = close - open

#########

pl.plot(range(0, len(close[0])-20), close[0][:-20], color='black')

model = linear_model.LinearRegression(normalize=True)
model.fit([close[0][:-1]], [close[0][1:]])

print(close[0][-20:])
model.predict(close[0][-20:])


#pl.plot(range(0, 20), model.predict(close[0][-20:]), color='red')

pl.show()

The error line is

model.predict(close[0][-20:])

I've tried nesting it in a list, converting it to a NumPy array, and anything else I could find on Google, but I have no idea what I'm doing here.

What does this error mean and why is it happening?

Upvotes: 0

Views: 2964

Answers (1)

CT Zhu

Reputation: 54400

Trying to predict stock prices with simple linear regression? :^| Anyway, this is what you need to change:

In [19]: M = model.fit(close[0][:-1].reshape(-1, 1), close[0][1:].reshape(-1, 1))

In [31]: M.predict(close[0][-20:].reshape(-1, 1))
Out[31]:
array([[ 90.92224274],
       [ 94.41875811],
       [ 93.19997275],
       [ 94.21895723],
       [ 94.31885767],
       [ 93.030142  ],
       [ 90.76240203],
       [ 91.29187436],
       [ 92.41075928],
       [ 89.0940647 ],
       [ 85.10803717],
       [ 86.90624508],
       [ 89.39376602],
       [ 90.59257129],
       [ 91.27189427],
       [ 91.02214318],
       [ 92.86031126],
       [ 94.25891741],
       [ 94.45871828],
       [ 92.65052033]])

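The error comes from the shapes. Wrapping the arrays in a list, as in model.fit([close[0][:-1]], [close[0][1:]]), gives X the shape [1, n-1]: a single sample with n-1 features. predict then receives 20 values, which cannot be multiplied against coefficients fitted for n-1 features, hence "matrices are not aligned".

Remember, when you build a model, X for the .fit method should have the shape [n_samples, n_features], and y should have one row per sample. The same applies to the .predict method: its input must be 2-D with the same number of features as the X used for fitting.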
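To make the shape rule concrete, here is a minimal sketch. It does not use the Yahoo data from the question: `prices` is a made-up random-walk series standing in for `close[0]`, and the variable names are only illustrative. It assumes a reasonably recent NumPy and scikit-learn, and omits the `normalize` argument because newer scikit-learn releases no longer accept it.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical stand-in for close[0]: a synthetic random-walk price series.
rng = np.random.default_rng(0)
prices = 50.0 + np.cumsum(rng.normal(0.0, 1.0, size=200))

# Wrapping the series in a list yields ONE sample with 199 features,
# which is the shape the question's fit() call produced.
wrong_X = np.asarray([prices[:-1]])   # shape (1, 199)

# One feature per sample: today's close is used to predict tomorrow's close.
X = prices[:-1].reshape(-1, 1)        # shape (199, 1)
y = prices[1:]                        # shape (199,)

model = LinearRegression().fit(X, y)

# The input to predict must be 2-D with the same number of columns as X.
X_new = prices[-20:].reshape(-1, 1)   # shape (20, 1)
predicted = model.predict(X_new)      # shape (20,)

print(wrong_X.shape, X.shape, X_new.shape, predicted.shape)
```

With a 1-D `y`, `predict` returns a 1-D array; reshaping `y` to a column, as in the answer above, gives a column of predictions instead.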

Upvotes: 2
