sklearn StandardScaler outputting wrong matrix

[10 15 18 11]
[15 17 24 16]
[13 13 20 14]
[12 20 10 25]
[12 11 14 11]

I have this data, and I'm trying to scale it using sklearn.preprocessing.StandardScaler:

import pandas as pd
from sklearn.preprocessing import StandardScaler

# df already holds the data shown above
scaler = StandardScaler()

scaled = scaler.fit_transform(df)
scaled_df = pd.DataFrame(scaled, columns=df.columns)
scaled_df.head()

This outputs:

array([[-1.32680694, -0.06401844,  0.16552118, -0.85248268],
       [ 1.73505523,  0.57616596,  1.40693001,  0.11624764],
       [ 0.20412415, -0.70420284,  0.57932412, -0.27124449],
       [-0.30618622,  1.53644256, -1.4896906 ,  1.85996222],
       [-0.30618622, -1.34438724, -0.66208471, -0.85248268]])

I know this is wrong because the covariance matrix of the scaled data shows a variance of 1.25 on the diagonal, when by definition it should be 1. The original data is correctly saved in the df variable, and if I standardize the data manually I get the correct result, so I really don't know what's going on with the scaler.
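For reference, this is roughly the manual standardization I'm comparing against (just a sketch, using the pandas defaults):

manual = (df - df.mean()) / df.std()  # pandas std() defaults to ddof=1
manual.std()                          # 1.0 in every column, which is what I expect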

Upvotes: 0

Views: 41

Answers (1)

StupidWolf

Reputation: 46988

Most likely you are checking with the pandas method std, whose degrees of freedom (ddof) default to 1. StandardScaler divides by the population standard deviation, which is what numpy.std computes with its default ddof of 0. If you check with ddof=0, the scaled columns have a standard deviation of exactly 1.

To illustrate:

import pandas as pd
from sklearn.preprocessing import StandardScaler

data = [[10, 15, 18, 11], [15, 17, 24, 16], [13, 13, 20, 14],
        [12, 20, 10, 25], [12, 11, 14, 11]]

df = pd.DataFrame(data)
scaler = StandardScaler()
scaled = scaler.fit_transform(df)
scaled_df = pd.DataFrame(scaled, columns=df.columns)

scaled_df

          0         1         2         3
0 -1.477098 -0.064018  0.165521 -0.852483
1  1.600189  0.576166  1.406930  0.116248
2  0.369274 -0.704203  0.579324 -0.271244
3 -0.246183  1.536443 -1.489691  1.859962
4 -0.246183 -1.344387 -0.662085 -0.852483

scaled_df.std()

0    1.118034
1    1.118034
2    1.118034
3    1.118034

scaled_df.std(ddof=0)

0    1.0
1    1.0
2    1.0
3    1.0
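You can also reproduce the scaler by hand with the population statistics (a quick sketch, reusing df and scaled from above):

import numpy as np

# StandardScaler divides by the population standard deviation (ddof=0)
manual = (df - df.mean()) / df.std(ddof=0)
np.allclose(manual.values, scaled)   # True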

Upvotes: 1
