Amit Rastogi

Reputation: 958

Feature scaling using python StandardScaler produces negative values

I am a newbie in machine learning. I am trying to apply feature scaling to my training and test data using scikit-learn's StandardScaler class. However, some of the scaled values are negative even though none of the input values are negative. Is this normal, or am I missing something in my code? The relevant feature-scaling code is given below.

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
train = sc.fit_transform(train)  # train contains the training feature matrix
test = sc.transform(test)        # test contains the test feature matrix
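To make the fit/transform pattern above concrete, here is a minimal runnable sketch with made-up, all-positive feature matrices (the variable names and values are illustrative, not from the question):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up positive-valued feature matrices, shaped (n_samples, n_features)
train = np.array([[1.0, 10.0],
                  [2.0, 20.0],
                  [3.0, 30.0]])
test = np.array([[2.0, 25.0]])

sc = StandardScaler()
train_scaled = sc.fit_transform(train)  # learns per-column mean/std from train
test_scaled = sc.transform(test)        # reuses train's mean/std on test

print(train_scaled)  # rows below a column's mean come out negative
print(test_scaled)
```

Note that `fit_transform` is called only on the training data; the test data is transformed with the statistics learned from training, which is the standard way to avoid leaking test information into the scaler.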

Upvotes: 5

Views: 9862

Answers (1)

Pavel

Reputation: 7562

From the docs:

Standardize features by removing the mean and scaling to unit variance

This means that, given an input x, it is transformed to (x - mean) / std, where the mean and standard deviation are computed per feature.

So even if your input values are all positive, removing the mean can make some of them negative:

>>> import numpy as np
>>> x = np.array([3, 5, 7])
>>> np.mean(x)
5.0
>>> x - np.mean(x)
array([-2.,  0.,  2.])
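Running StandardScaler itself on the same values shows the full effect, division by the standard deviation included (a minimal sketch; the data is reshaped to a column because scikit-learn expects 2-D input):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Same all-positive values as above, as a (3, 1) feature matrix
X = np.array([[3.0], [5.0], [7.0]])

sc = StandardScaler()
X_scaled = sc.fit_transform(X)

print(X_scaled.ravel())  # entries below the mean are negative
print(X_scaled.mean())   # approximately 0
print(X_scaled.std())    # approximately 1
```

The scaled column has zero mean and unit variance by construction, so any value below the original mean necessarily becomes negative.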


Upvotes: 9

Related Questions