Reputation: 136
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
dataset = pd.read_csv('Position_Salaries.csv')
X = dataset.iloc[:, 1:2].values
y = dataset.iloc[:, 2].values
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
sc_y = StandardScaler()
X = sc_X.fit_transform(X)
y = sc_y.fit_transform(y)
OK, so here is the problem. Both X and y have a single feature, i.e. one column. As you can see, X is a matrix (2D) and y is a vector (1D): X = dataset.iloc[:, 1:2].values and y = dataset.iloc[:, 2].values
Now when I run y = sc_y.fit_transform(y), I get an error saying that y is a 1D array. I can avoid it by changing the line to y = dataset.iloc[:, 2:3].values, which makes y a 2D array.
But I want y to stay a 1D array, since it is the dependent variable. Also, I have worked through other examples where I had to rescale similar data, and they did not give me this kind of error, so I am not sure why I am getting it now. Moreover, I am following a video while coding, and in the video everything is the same but the presenter doesn't get any error.
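To make the shape difference concrete, here is a minimal sketch that uses a small made-up DataFrame in place of Position_Salaries.csv (the column names and values are assumptions for illustration only). It shows how the two slicing styles produce different shapes:

```python
import pandas as pd

# Made-up stand-in for Position_Salaries.csv; names and values are illustrative only.
dataset = pd.DataFrame({
    "Position": ["Analyst", "Manager", "Director", "CEO"],
    "Level": [1, 2, 3, 4],
    "Salary": [45000, 80000, 150000, 1000000],
})

X = dataset.iloc[:, 1:2].values     # slice 1:2 keeps the column axis
y_1d = dataset.iloc[:, 2].values    # a plain integer index drops the column axis
y_2d = dataset.iloc[:, 2:3].values  # slice 2:3 keeps the column axis

print(X.shape)     # (4, 1)
print(y_1d.shape)  # (4,)
print(y_2d.shape)  # (4, 1)
```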
Upvotes: 5
Views: 21009
Reputation: 21
StandardScaler used to accept 1D arrays, but with a DeprecationWarning: "Passing 1d arrays as data is deprecated in 0.17 and will raise ValueError in 0.19. Reshape your data either using X.reshape(-1, 1) if your data has a single feature or X.reshape(1, -1) if it contains a single sample." Newer versions raise the error instead of the warning, which is likely why the same code runs without an error in the video you are following.
So the following is the solution you are looking for:
sc_y = StandardScaler()
y = np.array(y).reshape(-1,1)
y = sc_y.fit_transform(y)
y = y.flatten()
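As a hedged illustration of the same idea, the sketch below scales a 1D target through a temporary 2D view and then maps it back to the original units with inverse_transform. The salary values are hypothetical, and ravel() is used here in place of flatten() only as a stylistic choice:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical target values, not taken from the original dataset.
y = np.array([45000.0, 80000.0, 150000.0, 1000000.0])

sc_y = StandardScaler()

# Reshape to (n_samples, 1) for the scaler, then go back to 1D.
y_scaled = sc_y.fit_transform(y.reshape(-1, 1)).ravel()

# The fitted scaler can undo the scaling, e.g. for predictions.
y_restored = sc_y.inverse_transform(y_scaled.reshape(-1, 1)).ravel()

print(y_scaled.shape)              # (4,)
print(np.allclose(y_restored, y))  # True
```

Keeping the fitted sc_y around is what makes it possible to convert scaled predictions back into the original units later.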
Upvotes: 2
Reputation: 11
from sklearn.preprocessing import StandardScaler
sc_X = StandardScaler()
X = sc_X.fit_transform(X)
sc_y = StandardScaler()
y = np.array(y).reshape(-1,1)
y = sc_y.fit_transform(y)
y = y.flatten()
Upvotes: 1
Reputation: 36599
StandardScaler is meant to work on features, not on labels or target data, so it only accepts 2D data. Please see the StandardScaler documentation for details.
What you can do is use the scale function, which also accepts 1D arrays. StandardScaler is essentially an estimator-style wrapper around the same computation.
from sklearn.preprocessing import scale
y = scale(y)
Or, if you want to use StandardScaler, you need to reshape your y to a 2D array like this:
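import numpy as np
from sklearn.preprocessing import StandardScaler

sc_y = StandardScaler()
y = np.array(y).reshape(-1, 1)  # column of shape (n_samples, 1)
y = sc_y.fit_transform(y)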
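A small sketch comparing the two options on hypothetical data follows. It assumes the default settings of both scale and StandardScaler, and only checks that they give the same numbers:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, scale

# Hypothetical 1D target values, used only for the comparison.
y = np.array([45000.0, 80000.0, 150000.0, 1000000.0])

# Option 1: the function accepts the 1D array directly.
y_func = scale(y)

# Option 2: the estimator needs a 2D column, flattened afterwards.
y_est = StandardScaler().fit_transform(y.reshape(-1, 1)).ravel()

print(y_func.shape)                # (4,)
print(np.allclose(y_func, y_est))  # True
```

One design consideration: scale returns only the scaled values, so if you later need to undo the scaling, a fitted StandardScaler with inverse_transform may be the more convenient choice.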
Upvotes: 10
Reputation: 918
You can use flatten to get a 1D array from the 2D array:
y.flatten()
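Note that flatten() returns a new array and leaves y itself unchanged, so the result has to be assigned. A minimal sketch on a made-up array, also showing ravel() as an alternative that returns a view when it can:

```python
import numpy as np

# Made-up 2D column, standing in for a scaled target.
y_2d = np.arange(5.0).reshape(-1, 1)

flat_copy = y_2d.flatten()  # always a copy
flat_view = y_2d.ravel()    # a view here, since the data is contiguous

print(y_2d.shape)                         # (5, 1), the original is unchanged
print(flat_copy.shape)                    # (5,)
print(np.shares_memory(y_2d, flat_copy))  # False
print(np.shares_memory(y_2d, flat_view))  # True
```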
Upvotes: -1