Sklearn LinearRegression.predict() gives an error of size difference

Question

I am trying to build a linear regression model that predicts the next number in a list. It stores 60 values in x_train and the following value in y_train, then shifts up by one and stores the next 60 values in x_train and the value after them in y_train. This repeats until it has covered 80% of the dataset. The fit function works fine, but when I then call .predict() with a new list of 60 x values from the dataset, it returns this error:

ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 60 is different from 1)

I think this comes from a shape issue. I tried to reshape pred_x with pred_x.reshape(60, 1), but that still gave the same error.

My code:

import math

import numpy as np
from sklearn.linear_model import LinearRegression

training_data_len = math.ceil(len(dataset) * .8)

train_data = dataset[0:training_data_len , :]
print(train_data)


x_train = []
y_train = []

for i in range(60, len(train_data)):
    x_train.append(train_data[i-60:i])
    y_train.append(train_data[i])

lr = LinearRegression()

nsamples, nx, ny = np.shape(x_train)
x_train = np.reshape(x_train, (nsamples, nx*ny))

lr.fit(x_train, y_train)

len_data = len(df)

pred_x = df[len_data-61:len_data-1]

pred_x = pred_x.values

prediction = lr.predict(pred_x)

print(prediction)

print(df[0])

piterbarg · Accepted Answer

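You need to reshape the data you pass to predict so that it has the same number of dimensions as the data you trained on, with the same shape for the 'feature' dimensions (all but the 'nsamples' dimension). Your training data has shape (nsamples, nx*ny), with nx*ny=60 as far as I can tell. To predict from one sample, reshape pred_x into one row of 60 features: pred_x = pred_x.reshape(1, 60). Note that reshape returns a new array instead of modifying pred_x in place, so the result has to be assigned back. reshape(60, 1) does not help because it describes 60 samples with 1 feature each, which, judging by the error message, is the shape pred_x already has.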
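To make the shapes concrete, here is a minimal sketch of the fix. It assumes a single-column dataset; the synthetic `dataset` array and the `window` variable are stand-ins I made up, not the asker's data, and the prediction window here is simply the last 60 values.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic single-column data standing in for the asker's dataset (assumption).
dataset = np.arange(200, dtype=float).reshape(-1, 1)

window = 60  # hypothetical name for the 60-value look-back
x_train, y_train = [], []
for i in range(window, len(dataset)):
    x_train.append(dataset[i - window:i, 0])  # the 60 previous values, flattened
    y_train.append(dataset[i, 0])             # the value that follows them

x_train = np.array(x_train)  # shape (nsamples, 60)
y_train = np.array(y_train)  # shape (nsamples,)

lr = LinearRegression()
lr.fit(x_train, y_train)

# One sample with 60 features: a single row, not a single column.
pred_x = dataset[-window:].reshape(1, -1)  # shape (1, 60)
prediction = lr.predict(pred_x)            # one prediction for one sample

print(pred_x.shape, prediction.shape)
```

Using reshape(1, -1) lets NumPy infer the 60, so the same line keeps working if the window length changes, as long as it matches the number of features used in fit.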
