Reputation: 958
I am new to machine learning. I am trying to apply feature scaling to my training and test data using scikit-learn's StandardScaler class. However, some of the scaled values are negative even though none of the input values are negative. Is this normal, or am I missing something in my code? The relevant code used for feature scaling is given below.
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
train = sc.fit_transform(train)  # train contains the training feature matrix
test = sc.transform(test)  # test contains the test feature matrix
Upvotes: 5
Views: 9862
Reputation: 7562
From the docs:
Standardize features by removing the mean and scaling to unit variance
This means that, given an input x, it is transformed to (x-mean)/std (where all dimensions and operations are well defined).
So even if your input values are all positive, removing the mean can make some of them negative:
>>> import numpy as np
>>> x = np.array([3, 5, 7])
>>> np.mean(x)
5.0
>>> x - np.mean(x)
array([-2.,  0.,  2.])
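To see the same effect with StandardScaler itself, here is a minimal sketch. The small train and test matrices are made-up illustration data, not the asker's, and the manual formula assumes the population standard deviation (ddof=0), which is what the scikit-learn docs describe for StandardScaler.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical all-positive data (rows = samples, columns = features).
train = np.array([[3.0, 100.0],
                  [5.0, 200.0],
                  [7.0, 300.0]])
test = np.array([[4.0, 250.0]])

sc = StandardScaler()
train_scaled = sc.fit_transform(train)  # learns per-column mean and std from train
test_scaled = sc.transform(test)        # reuses the statistics learned from train

# Manual version of (x - mean) / std, column by column.
# np.std defaults to ddof=0, i.e. the population standard deviation.
manual = (train - train.mean(axis=0)) / train.std(axis=0)

print(train_scaled)                        # values below the column mean come out negative
print(np.allclose(train_scaled, manual))   # scaler output agrees with the manual formula
print(train_scaled.mean(axis=0))           # approximately 0 for every column
print(train_scaled.std(axis=0))            # approximately 1 for every column
print(test_scaled)                         # test rows are scaled with the training statistics
```

Every value that was below its column's mean ends up negative, so negative outputs are expected and do not point to a mistake in the code.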
Upvotes: 9