bas.cmd

Reputation: 45

Sklearn LinearRegression.predict() gives a size mismatch error

I am trying to build a linear regression model that predicts the next number in a list. It takes 60 consecutive values as one x_train sample and the value that follows as the matching y_train target, then shifts the window up by one and repeats until it has covered 80% of the dataset. The fit function works fine, but when I call .predict() with a new list of 60 x values from the dataset, it returns this error:

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 60 is different from 1)

I think this comes from a shape issue. I tried to reshape pred_x with pred_x.reshape(60, 1), but that still gave the same error.

My code:

import math

import numpy as np
from sklearn.linear_model import LinearRegression

training_data_len = math.ceil(len(dataset) * .8)

train_data = dataset[0:training_data_len , :]
print(train_data)


x_train = []
y_train = []

for i in range(60, len(train_data)):
    x_train.append(train_data[i-60:i])
    y_train.append(train_data[i])

lr = LinearRegression()

nsamples, nx, ny = np.shape(x_train)
x_train = np.reshape(x_train, (nsamples, nx*ny))

lr.fit(x_train, y_train)

len_data = len(df)

pred_x = df[len_data-61:len_data-1]

pred_x = pred_x.values

prediction = lr.predict(pred_x)

print(prediction)

print(df[0])

Upvotes: 1

Views: 846

Answers (1)

piterbarg

Reputation: 8219

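The array you pass to predict must have the same number of dimensions as the array you passed to fit, and the same size along the feature dimensions (every dimension except the first, nsamples). Your training data has shape (nsamples, nx*ny), with nx*ny = 60 as far as I can tell. To predict from a single sample, reshape pred_x to one row of 60 features: pred_x = pred_x.reshape(1, 60). Note that reshape returns a new array, so the result has to be assigned back, and that reshape(60, 1) describes 60 samples with one feature each, which is why it did not help.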
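As a rough illustration of the shapes involved, here is a minimal sketch. It assumes a synthetic single-column dataset in place of the asker's data (the real df is not shown), and the name WINDOW is introduced here only for readability. It also builds y_train as a 1-D array, which differs slightly from the question's list of one-element arrays.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

WINDOW = 60  # number of past values per sample, as in the question

# Assumption: a synthetic one-column dataset standing in for the asker's data.
dataset = np.arange(200, dtype=float).reshape(-1, 1)  # shape (200, 1)

# Use the first 80% of the rows for training, as in the question.
training_data_len = int(np.ceil(len(dataset) * 0.8))
train_data = dataset[:training_data_len]

# Each row of x_train holds 60 consecutive values; y_train holds the next value.
x_train = np.array(
    [train_data[i - WINDOW:i, 0] for i in range(WINDOW, len(train_data))]
)                                 # shape (nsamples, 60)
y_train = train_data[WINDOW:, 0]  # shape (nsamples,)

lr = LinearRegression()
lr.fit(x_train, y_train)

# The 60 values before the last one, mirroring df[len_data-61:len_data-1].
pred_x = dataset[-(WINDOW + 1):-1]  # shape (60, 1): 60 samples, 1 feature
pred_x = pred_x.reshape(1, WINDOW)  # shape (1, 60): 1 sample, 60 features

prediction = lr.predict(pred_x)     # shape (1,)
print(pred_x.shape, prediction.shape)
```

Writing pred_x.reshape(1, -1) would work as well and lets NumPy infer the feature count from the data.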

Upvotes: 1
