Reputation: 535
I have a data frame as shown below, and I used scikit-learn's MinMaxScaler to normalize the values of the path_len column to between zero and one.
First three rows of my dataframe before:
feature_1, feature_2, path_len
0 1 10
1 1 16
0 0 117
Code:
from sklearn.preprocessing import MinMaxScaler
min_max = MinMaxScaler()
features['path_len'] = min_max.fit_transform(features[['path_len']])
First three rows of my dataframe after:
feature_1, feature_2, path_len
0 1 0.033582
1 1 0.055970
0 0 0.432836
When I then try to use min_max.transform() on a new input value for path_len, I get the exact same value back:
def preprocess_input(link, min_max, features):
    df = pd.DataFrame(columns=features.columns)
    df['feature_1'] = ...
    df['feature_2'] = ...
    df['path_len'] = 86  # arbitrary number
    df['path_len'] = min_max.transform(df[['path_len']])  ### right here!
    return df
The final value in df['path_len'] is 86 again!
How do I go about solving this?
Upvotes: 1
Views: 1021
Reputation: 11
If you don't want to fit the scaler object again, you can do the following:
scaled_value = (value_in_original_scale - scaler.data_min_[i]) * scaler.scale_[i]
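where i is the index of the column you want to transform, counted among the columns the scaler was fitted on. This holds for the default feature_range of (0, 1).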
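As a quick check of the formula above, here is a minimal sketch comparing the manual computation with scaler.transform(). The path_len values are made up for illustration (they are not the asker's dataset), and the default feature_range of (0, 1) is assumed.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Illustrative data only; these values are an assumption, not the asker's dataset.
features = pd.DataFrame({"path_len": [1, 10, 16, 117, 269]})

scaler = MinMaxScaler()  # default feature_range=(0, 1)
scaler.fit(features[["path_len"]])

i = 0  # index of the column among those the scaler was fitted on
value_in_original_scale = 86

# Manual scaling using the attributes learned during fit
manual = (value_in_original_scale - scaler.data_min_[i]) * scaler.scale_[i]

# The same value scaled through the fitted scaler, passed as a one-row frame
new_frame = pd.DataFrame({"path_len": [value_in_original_scale]})
via_transform = scaler.transform(new_frame)[0, 0]

print(manual, via_transform, np.isclose(manual, via_transform))
```

Both numbers should agree, since with the default range transform() computes X * scale_ + min_, and min_ equals -data_min_ * scale_.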
Upvotes: 0
Reputation: 25199
Change your line:
features['path_len'] = min_max.fit_transform(features[['path_len']])
to:
min_max.fit(features[['path_len']])
features['path_len'] = min_max.transform(features[['path_len']])
and your code will work in full as expected.
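For reference, a minimal end-to-end sketch of fitting once and then transforming a new value. The training values, the feature_1 and feature_2 arguments, and the reworked preprocess_input signature are all hypothetical; the sketch also assumes the helper is meant to return a frame with exactly one row.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Illustrative training data; the values are assumptions, not the asker's dataset.
features = pd.DataFrame({
    "feature_1": [0, 1, 0, 1, 0],
    "feature_2": [1, 1, 0, 0, 1],
    "path_len": [1, 10, 16, 117, 269],
})

min_max = MinMaxScaler()
min_max.fit(features[["path_len"]])  # learn the min and max once

def preprocess_input(feature_1, feature_2, path_len, min_max, features):
    """Hypothetical variant of the helper: build a one-row frame, then scale path_len."""
    df = pd.DataFrame(
        [[feature_1, feature_2, path_len]], columns=features.columns
    )
    # transform() returns a 2-D array; flatten it before assigning to the column
    df["path_len"] = min_max.transform(df[["path_len"]]).ravel()
    return df

new_row = preprocess_input(0, 1, 86, min_max, features)
print(new_row)
```

Note that fit_transform() also fits the scaler, so it may matter more here that the new frame actually contains a row: assigning a scalar to a column of a frame created with only columns= leaves that frame empty.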
Upvotes: 1