Fluxy

Reputation: 2978

Scaling of features produces all NaN values

I have the following pandas data frame df:

COL1  COL2     COL3
0.0   -258.0   A
0.0   -262.2   A
0.0   -210.0   C
0.0   -84.0    B
0.0   -237.0   A
0.0   -277.2   B
0.0   -273.0   A
0.0   15.0     B
0.0   21.0     C
0.0   -61.8    C

I want to apply RobustScaler to the numerical features COL1 and COL2:

scaler = preprocessing.RobustScaler(quantile_range = (0.0,0.9))
scaler.fit(df_subset[["COL1","COL2"]])
df[["COL1","COL2"]] = pd.DataFrame(scaler.transform(df[["COL1","COL2"]]), columns=["COL1","COL2"])

However, when I check the result, I see all NaN values in COL1 and COL2:

df[["COL1","COL2"]]

NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN

Upvotes: 0

Views: 1109

Answers (1)

talatccan

Reputation: 743

Are you sure nothing else is going on in your code? I ran your snippet on your dataset (fitting on df rather than df_subset) and it worked fine.

Before Scaler

  COL1    COL2 COL3
0  0.0  -258.0    A
1  0.0  -262.2    A
2  0.0  -210.0    C
3  0.0   -84.0    B
4  0.0  -237.0    A
5  0.0  -277.2    B
6  0.0  -273.0    A
7  0.0    15.0    B
8  0.0    21.0    C
9  0.0   -61.8    C

The code I've run:

import pandas as pd
from sklearn import preprocessing

scaler = preprocessing.RobustScaler(quantile_range=(0.0, 0.9))
scaler.fit(df[["COL1", "COL2"]])
df[["COL1", "COL2"]] = pd.DataFrame(scaler.transform(df[["COL1", "COL2"]]), columns=["COL1", "COL2"])

Output

   COL1        COL2 COL3
0   0.0 -101.410935    A
1   0.0 -113.756614    A
2   0.0   39.682540    C
3   0.0  410.052910    B
4   0.0  -39.682540    A
5   0.0 -157.848325    B
6   0.0 -145.502646    A
7   0.0  701.058201    B
8   0.0  718.694885    C
9   0.0  475.308642    C
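
A side note on the magnitudes in COL2: scikit-learn reads quantile_range as percentiles between 0 and 100, so (0.0, 0.9) spans only the 0th to the 0.9th percentile and the divisor ends up very small. If a 0 to 90 percent range was intended (an assumption, the question does not say), it would be written (0.0, 90.0). A small sketch comparing the fitted scale_ for both settings on the COL2 values above:

```python
import numpy as np
from sklearn.preprocessing import RobustScaler

# COL2 values from the question, shaped as a single feature column.
col2 = np.array([-258.0, -262.2, -210.0, -84.0, -237.0,
                 -277.2, -273.0, 15.0, 21.0, -61.8]).reshape(-1, 1)

# quantile_range is given in percent: (0.0, 0.9) is the 0th..0.9th percentile.
narrow = RobustScaler(quantile_range=(0.0, 0.9)).fit(col2)

# Assumed alternative: the 0th..90th percentile.
wide = RobustScaler(quantile_range=(0.0, 90.0)).fit(col2)

narrow_scale = narrow.scale_[0]  # width of the narrow percentile range
wide_scale = wide.scale_[0]      # width of the wide percentile range
print(narrow_scale, wide_scale)
```

The narrow range gives a scale of about 0.34, against roughly 292.8 for the wide one, which is why the scaled values above run into the hundreds.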

Update: assign the transformed array directly instead of wrapping it in a new DataFrame:

scaler = preprocessing.RobustScaler(quantile_range=(0.0, 0.9))
scaler.fit(df[['COL1', 'COL2']])
df[['COL1', 'COL2']] = scaler.transform(df[['COL1', 'COL2']])
print(df[['COL1', 'COL2']])
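
One possible cause of the all-NaN result, assuming the real df was filtered or sliced so that its index is no longer 0..n-1 (the question does not show this): when a DataFrame is assigned to columns, pandas aligns it on index labels, and a DataFrame built from scaler.transform(...) gets a fresh 0..n-1 index. Rows whose labels do not match come out as NaN. A minimal sketch with made-up index labels:

```python
import pandas as pd
from sklearn import preprocessing

# Hypothetical frame whose index is not 0..n-1, e.g. left over from filtering.
df = pd.DataFrame(
    {"COL1": [0.0, 0.0, 0.0], "COL2": [-258.0, -84.0, 21.0], "COL3": ["A", "B", "C"]},
    index=[10, 11, 12],
)
cols = ["COL1", "COL2"]

scaler = preprocessing.RobustScaler(quantile_range=(0.0, 0.9))
scaled = scaler.fit_transform(df[cols])  # plain NumPy array, carries no index

# Wrapping the array in a new DataFrame gives it index 0, 1, 2. Assignment
# aligns on labels, none match 10, 11, 12, so every value becomes NaN.
broken = df.copy()
broken[cols] = pd.DataFrame(scaled, columns=cols)

# Assigning the array itself is positional, so no alignment takes place.
fixed = df.copy()
fixed[cols] = scaled

# Passing the original index also keeps the labels lined up.
fixed_with_index = df.copy()
fixed_with_index[cols] = pd.DataFrame(scaled, columns=cols, index=df.index)

print(broken[cols])
print(fixed[cols])
```

If df.index in your session is anything other than 0..n-1, this would explain the NaNs, and either of the last two assignments should avoid them.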

Upvotes: 1
