Detoke

Reputation: 11

Linear Regressor unable to predict a set of values; Error: ValueError: shapes (100,1) and (2,1) not aligned: 1 (dim 1) != 2 (dim 0)

I have 2 numpy arrays:

x = np.linspace(0,10,n) + np.random.randn(n)/5
y = np.sin(x)+x/6 + np.random.randn(n)/10

I want to train a linear regressor on this data. To compare the relationship between complexity and generalization, I am using PolynomialFeatures preprocessing with four degrees (1, 3, 6, 9). After fitting the model, I want to test it on the array x = np.linspace(0, 10, 100).

After much trial and error, I figured out that the x and y arrays need to be reshaped, and I did that. However, when I create the new x array to predict on, the call fails because the dimensions are not aligned. The estimator does work on the test split of the original x array.

Below is my code

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

np.random.seed(0)
n = 100
x = np.linspace(0,10,n) + np.random.randn(n)/5
y = np.sin(x)+x/6 + np.random.randn(n)/10

X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)

def fn_one():
    from sklearn.linear_model import LinearRegression
    from sklearn.preprocessing import PolynomialFeatures

    x_predict = np.linspace(0,10,100)
    x_predict = x_predict.reshape(-1, 1)
    degrees = [1, 3, 6, 9]
    predictions = []

    for i, deg in enumerate(degrees):
        linReg = LinearRegression()
        pf = PolynomialFeatures(degree=deg)
        xt = x.reshape(-1, 1)
        yt = y.reshape(-1, 1)

        X_transformed = pf.fit_transform(xt)
        X_train_transformed, X_test_transformed, y_train_temp, y_test_temp = train_test_split(X_transformed, yt, random_state=0)
        linReg.fit(X_train_transformed, y_train_temp)
        predictions.append(linReg.predict(x_predict))

    np.array(predictions)
    return predictions

The shapes of the different arrays (at degree 3 in the loop):

x_predict = (100, 1)

xt = 100, 1

yt = 100, 1

X_train_transformed = 75, 4

y_train_temp = 75, 1

X_test_transformed = 25, 4

y_test_temp = 25, 1

predictions for X_test_transformed = 4, 25, 1

predictions for x_predict = Not working:

Error = ValueError: shapes (100,1) and (2,1) not aligned: 1 (dim 1) != 2 (dim 0)

Upvotes: 1

Views: 133

Answers (1)

Parthasarathy Subburaj

Reputation: 4264

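You forgot to transform your x_predict. The model is fitted on the polynomial features (two columns at degree 1, which is where the (2,1) in the error message comes from), while x_predict still has a single column. I have updated your code below: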

import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

np.random.seed(0)
n = 100
x = np.linspace(0,10,n) + np.random.randn(n)/5
y = np.sin(x)+x/6 + np.random.randn(n)/10

X_train, X_test, y_train, y_test = train_test_split(x, y, random_state=0)

def fn_one():
    from sklearn.linear_model import LinearRegression
    from sklearn.preprocessing import PolynomialFeatures

    # New inputs to predict on, as a single-column 2-D array
    x_predict = np.linspace(0,10,100)
    x_predict = x_predict.reshape(-1, 1)
    degrees = [1, 3, 6, 9]
    predictions = []

    for deg in degrees:
        linReg = LinearRegression()
        pf = PolynomialFeatures(degree=deg)
        xt = x.reshape(-1, 1)
        yt = y.reshape(-1, 1)

        # Expand x into polynomial features, then split into train and test
        X_transformed = pf.fit_transform(xt)
        X_train_transformed, X_test_transformed, y_train_temp, y_test_temp = train_test_split(X_transformed, yt, random_state=0)
        linReg.fit(X_train_transformed, y_train_temp)

        # Apply the same expansion to the new inputs before predicting
        x_predict_transformed = pf.transform(x_predict)
        predictions.append(linReg.predict(x_predict_transformed))

    # Stack the per-degree predictions into one array of shape (4, 100, 1)
    return np.array(predictions)

And now when you call fn_one() you will get the predictions.
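
If you want to avoid this kind of mismatch altogether, one option is to chain the preprocessing and the regressor in a scikit-learn pipeline, so that whatever is passed to predict goes through the same polynomial expansion as the training data. The following is only a minimal sketch under a few assumptions: the helper name fit_and_predict is made up for illustration, y is left one-dimensional (so each prediction is a flat array), and the split is done on the raw x rather than on the transformed features.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

np.random.seed(0)
n = 100
x = np.linspace(0, 10, n) + np.random.randn(n) / 5
y = np.sin(x) + x / 6 + np.random.randn(n) / 10


def fit_and_predict(degrees=(1, 3, 6, 9)):
    """Hypothetical helper: fit one pipeline per degree, predict on a grid."""
    # Split the raw single-column inputs; the pipeline does the expansion.
    X_train, X_test, y_train, y_test = train_test_split(
        x.reshape(-1, 1), y, random_state=0
    )
    x_predict = np.linspace(0, 10, 100).reshape(-1, 1)

    predictions = []
    for deg in degrees:
        # PolynomialFeatures is applied inside both fit() and predict().
        model = make_pipeline(PolynomialFeatures(degree=deg), LinearRegression())
        model.fit(X_train, y_train)
        predictions.append(model.predict(x_predict))

    # One row of 100 predictions per degree.
    return np.array(predictions)


predictions = fit_and_predict()
print(predictions.shape)
```

Because the pipeline owns the transformer, there is no separate x_predict_transformed to remember, and the raw x_predict can be passed to predict directly.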

Upvotes: 1
