Reputation: 2978
I have the following pandas DataFrame df:
COL1 COL2 COL3
0.0 -258.0 A
0.0 -262.2 A
0.0 -210.0 C
0.0 -84.0 B
0.0 -237.0 A
0.0 -277.2 B
0.0 -273.0 A
0.0 15.0 B
0.0 21.0 C
0.0 -61.8 C
I want to apply RobustScaler to the numerical features COL1 and COL2:
scaler = preprocessing.RobustScaler(quantile_range = (0.0,0.9))
scaler.fit(df_subset[["COL1","COL2"]])
df[["COL1","COL2"]] = pd.DataFrame(scaler.transform(df[["COL1","COL2"]]), columns=["COL1","COL2"])
However, when I check the result, I see all NaN values in COL1 and COL2:
df[["COL1","COL2"]]
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
NaN NaN
Upvotes: 0
Views: 1109
Reputation: 743
Are you sure nothing else in your code is affecting df? I ran your code on your dataset and it worked fine.
Before scaling:
COL1 COL2 COL3
0 0.0 -258.0 A
1 0.0 -262.2 A
2 0.0 -210.0 C
3 0.0 -84.0 B
4 0.0 -237.0 A
5 0.0 -277.2 B
6 0.0 -273.0 A
7 0.0 15.0 B
8 0.0 21.0 C
9 0.0 -61.8 C
The code I've run:
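import pandas as pd
from sklearn import preprocessing

# Fit on the same frame that is transformed (the question fits on df_subset instead)
scaler = preprocessing.RobustScaler(quantile_range=(0.0, 0.9))
scaler.fit(df[["COL1", "COL2"]])
df[["COL1", "COL2"]] = pd.DataFrame(scaler.transform(df[["COL1", "COL2"]]), columns=["COL1", "COL2"])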
Output
COL1 COL2 COL3
0 0.0 -101.410935 A
1 0.0 -113.756614 A
2 0.0 39.682540 C
3 0.0 410.052910 B
4 0.0 -39.682540 A
5 0.0 -157.848325 B
6 0.0 -145.502646 A
7 0.0 701.058201 B
8 0.0 718.694885 C
9 0.0 475.308642 C
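A side note on the size of these values: quantile_range is expressed in percent (0 to 100), so (0.0, 0.9) means the span between the 0th and the 0.9th percentile, which is tiny. The sketch below reproduces the COL2 numbers by hand with NumPy, assuming the scaler subtracts the median and divides by that percentile span. If the 0th to 90th percentile was intended, (0.0, 90.0) may be what you want.

```python
import numpy as np

# COL2 values from the question
col2 = np.array([-258.0, -262.2, -210.0, -84.0, -237.0,
                 -277.2, -273.0, 15.0, 21.0, -61.8])

# Center on the median, as RobustScaler does by default
center = np.median(col2)

# quantile_range=(0.0, 0.9) is read as the 0th and 0.9th percentiles
q_low, q_high = np.percentile(col2, [0.0, 0.9])
scale = q_high - q_low

scaled = (col2 - center) / scale
print(center, scale)
print(scaled[:3])
```

The first three printed values should line up with rows 0 to 2 of the output above.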
Update: you can also assign the transformed array directly, without wrapping it in a new DataFrame:
scaler = preprocessing.RobustScaler(quantile_range=(0.0, 0.9))
scaler.fit(df[['COL1', 'COL2']])
df[['COL1', 'COL2']] = scaler.transform(df[['COL1', 'COL2']])
print(df[['COL1', 'COL2']])
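One possible cause of the NaN values, which I cannot confirm from the question: if df does not have the default 0..n-1 index (for example because it was filtered earlier), then pd.DataFrame(...) creates a fresh 0..n-1 index, and pandas aligns on index labels during the assignment, so no rows match. A minimal sketch under that assumption; the index values 10, 11, 12 are hypothetical, and the default quantile_range is used for brevity:

```python
import pandas as pd
from sklearn.preprocessing import RobustScaler

cols = ["COL1", "COL2"]
df = pd.DataFrame(
    {"COL1": [0.0, 0.0, 0.0], "COL2": [-258.0, -210.0, 21.0]},
    index=[10, 11, 12],  # hypothetical non-default index, e.g. left over from filtering
)

scaled = RobustScaler().fit_transform(df[cols])  # plain NumPy array, no index

# Wrapping in a new DataFrame gives labels 0, 1, 2, which do not match 10, 11, 12
misaligned = df.copy()
misaligned[cols] = pd.DataFrame(scaled, columns=cols)

# Assigning the array directly is positional, so no alignment takes place
fixed = df.copy()
fixed[cols] = scaled

# Alternatively, keep the wrapper but reuse the original index
fixed_with_index = df.copy()
fixed_with_index[cols] = pd.DataFrame(scaled, columns=cols, index=df.index)

print(misaligned)
print(fixed)
```

If this is what happened, df.index will show something other than a plain RangeIndex starting at 0.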
Upvotes: 1