Reut

Reputation: 1592

ValueError: x and y must be the same size

I have a dataset on which I'm trying to fit a linear regression using sklearn. The dataset is ready-made, so there shouldn't be any problems with it. I used train_test_split to split my data into train and test sets. When I try to use matplotlib to create a scatter plot of my test values against the predictions, I get the following error:

ValueError: x and y must be the same size

This is my code:

y=data['Yearly Amount Spent']
x=data[['Avg. Session Length','Time on App','Time on Website','Length of Membership','Yearly Amount Spent']]
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.3, random_state=101)

#training the model

from sklearn.linear_model import LinearRegression
lm=LinearRegression()
lm.fit(x_train,y_train)
lm.coef_

predictions=lm.predict(X_test)

#here the problem starts:

plt.scatter(y_test,predictions)

Why does this error occur? I have seen previous posts here, and the suggestion was to check x.shape and y.shape, but I'm not sure what the purpose of that is.

Thanks

Upvotes: 0

Views: 2063

Answers (1)

seralouk

Reputation: 33147

It seems that you are using the EcommerceCustomers.csv dataset.

In your original post, the column 'Yearly Amount Spent' is used as y and is also included in x. This is wrong: the target should not be one of the features, otherwise the model just learns to copy it.
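To illustrate why the target should not be among the features, here is a minimal sketch on synthetic data. The arrays and numbers are made up for illustration and are not the real dataset; only `LinearRegression` comes from the code in this thread.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic stand-in for the real data (hypothetical values).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 2))
target = 3.0 * features[:, 0] - 2.0 * features[:, 1] + rng.normal(scale=0.5, size=200)

# Mistake from the question: the target is appended as an extra feature column.
leaky = np.column_stack([features, target])

model = LinearRegression().fit(leaky, target)
leaky_coef = model.coef_
r2 = model.score(leaky, target)

# The weight on the leaked column is (numerically) 1 and the others are 0,
# so the "perfect" fit says nothing about the real features.
print(leaky_coef, r2)
```

The fit looks perfect, but only because the answer was handed to the model as an input.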

The following should work fine:

import pandas as pd
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

data = pd.read_csv("EcommerceCustomers.csv")

# Target and features: the target column is NOT part of X
y = data['Yearly Amount Spent']
X = data[['Avg. Session Length', 'Time on App', 'Time on Website', 'Length of Membership']]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=101)

# Training the model
lm = LinearRegression()
lm.fit(X_train, y_train)

# The coefficients
print('Coefficients: \n', lm.coef_)

# Predicting on the test data (same X_test that came out of the split above)
predictions = lm.predict(X_test)

# y_test and predictions now have the same length, so this works
plt.scatter(y_test, predictions)
plt.show()
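On the x.shape / y.shape suggestion from the question: plt.scatter needs both inputs to have the same number of elements, so comparing the shapes shows which array is off. The sketch below uses synthetic data (hypothetical values, not the real CSV). One possible cause in the question's code, though I can't confirm it, is that the split creates x_test (lowercase) while predict is called on X_test (uppercase); in a notebook, an older X_test of a different length may still be defined.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 500 rows, 4 features (made-up coefficients).
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
y = X @ np.array([25.0, 38.0, 0.5, 61.0]) + rng.normal(size=500)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=101
)
lm = LinearRegression().fit(X_train, y_train)

# Correct: predict on the test split that matches y_test.
predictions = lm.predict(X_test)
print(y_test.shape, predictions.shape)  # both have 150 elements

# Simulated mistake: predicting on a different array than the one
# y_test belongs to (here the full X stands in for a stale variable).
stale_predictions = lm.predict(X)
print(y_test.shape, stale_predictions.shape)  # lengths differ

# A cheap guard before plotting with plt.scatter(y_test, predictions).
shapes_match = y_test.shape == predictions.shape
stale_match = y_test.shape == stale_predictions.shape
```

If the two shapes differ, plt.scatter raises "x and y must be the same size", so printing them before plotting points straight at the array that came from the wrong place.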


Upvotes: 1
