Reputation: 125
I'm revisiting a machine learning tutorial I did earlier in the year, and on my new laptop it has thrown up some compatibility issues. I've looked at several other SO answers and partly solved it, based on what seem to be renamed modules in the most recent version of scikit-learn. Here is the code, which ran fine when I did the tutorial:
import quandl, math
import numpy as np
import pandas as pd
from sklearn import preprocessing, cross_validation, svm
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
from matplotlib import style
import datetime
style.use('ggplot')
df = quandl.get("WIKI/GOOGL")
df = df[['Adj. Open', 'Adj. High', 'Adj. Low', 'Adj. Close', 'Adj. Volume']]
df['HL_PCT'] = (df['Adj. High'] - df['Adj. Low']) / df['Adj. Close'] * 100.0
df['PCT_change'] = (df['Adj. Close'] - df['Adj. Open']) / df['Adj. Open'] * 100.0
df = df[['Adj. Close', 'HL_PCT', 'PCT_change', 'Adj. Volume']]
forecast_col = 'Adj. Close'
df.fillna(value=-99999, inplace=True)
forecast_out = int(math.ceil(0.01 * len(df)))
df['label'] = df[forecast_col].shift(-forecast_out)
X = np.array(df.drop(['label'], axis=1))
X = preprocessing.scale(X)
X_lately = X[-forecast_out:]
X = X[:-forecast_out]
df.dropna(inplace=True)
y = np.array(df['label'])
X_train, X_test, y_train, y_test = cross_validation.train_test_split(X, y, test_size=0.2)
clf = LinearRegression(n_jobs=-1)
clf.fit(X_train, y_train)
confidence = clf.score(X_test, y_test)
forecast_set = clf.predict(X_lately)
df['Forecast'] = np.nan
last_date = df.iloc[-1].name
last_unix = last_date.timestamp()
one_day = 86400
next_unix = last_unix + one_day
for i in forecast_set:
    next_date = datetime.datetime.fromtimestamp(next_unix)
    next_unix += 86400
    df.loc[next_date] = [np.nan for _ in range(len(df.columns)-1)]+[i]
df['Adj. Close'].plot()
df['Forecast'].plot()
plt.legend(loc=4)
plt.xlabel('Date')
plt.ylabel('Price')
plt.show()
If you run this code as it is in Python 3.7 you'll get some errors related to scikit-learn, which I've been able to solve with advice from SO, but once I deal with them I get the following error:
H:\Documents\Python Scripts>py ML_tutorial_vid_5.1.py
Traceback (most recent call last):
  File "ML_tutorial_vid_5.1.py", line 34, in <module>
    X_train, X_test, y_train, y_test = cross_validate.train_test_split(X, y, test_size=0.2)
AttributeError: 'function' object has no attribute 'train_test_split'
All help appreciated.
Upvotes: 0
Views: 564
Reputation: 11937
You are getting this error because train_test_split is now in the model_selection module of sklearn (the old cross_validation module was deprecated in version 0.18 and removed in 0.20). You can see the change log over here.
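To see why the message mentions a 'function' object, here is a minimal sketch that uses no scikit-learn at all. The cross_validate defined below is a hypothetical stand-in, on the assumption that the name in the traceback refers to a plain function (as sklearn.model_selection.cross_validate is) and not to a module. Looking up an attribute that a function does not have raises the same kind of AttributeError.

```python
# Hypothetical stand-in: any plain function behaves the same way here.
def cross_validate():
    """Placeholder for a function that was imported in place of a module."""
    return None

try:
    # Functions do not carry a train_test_split attribute, so this lookup fails.
    cross_validate.train_test_split
except AttributeError as exc:
    message = str(exc)
    print(message)
```

So the fix is not to find another object to hang train_test_split off, but to import train_test_split itself.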
You can import it like this now.
from sklearn.model_selection import train_test_split
and use it like this:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
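As a rough end-to-end check, the sketch below runs the split and fit steps from the question with the new import. It uses synthetic data instead of the Quandl download, so the feature matrix, the coefficients, and the random_state values are assumptions made only for illustration. In the original script, cross_validation would also need to be dropped from the `from sklearn import ...` line.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the four scaled features in the question (assumed data).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 4))
y = X @ np.array([1.0, -2.0, 0.5, 3.0]) + rng.normal(scale=0.1, size=100)

# Same call as in the answer; random_state is added only to make the split repeatable.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

clf = LinearRegression(n_jobs=-1)
clf.fit(X_train, y_train)
confidence = clf.score(X_test, y_test)  # R^2 on the held-out 20%

print(X_train.shape, X_test.shape)
```

With test_size=0.2 and 100 rows, 80 rows go to training and 20 to testing; the rest of the tutorial code can stay as it was.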
Upvotes: 1