amro_ghoneim

Reputation: 535

MinMaxScaler returns same value for single input

I have a DataFrame (shown below) and used scikit-learn's MinMaxScaler to scale the values of the path_len column to the range between zero and one.

First three rows of my dataframe before:

feature_1, feature_2, path_len
        0          1        10
        1          1        16
        0          0       117

Code:

from sklearn.preprocessing import MinMaxScaler
min_max = MinMaxScaler()
features['path_len'] = min_max.fit_transform(features[['path_len']])

First three rows of my dataframe after:

feature_1, feature_2, path_len
        0          1  0.033582
        1          1  0.055970
        0          0  0.432836

When I then try to use min_max.transform() on a new input value for path_len, I get the same exact value:

import pandas as pd

def preprocess_input(link, min_max, features):
    df = pd.DataFrame(columns=features.columns)
    df['feature_1'] = ...
    df['feature_2'] = ...
    df['path_len'] = 86  # arbitrary number
    df['path_len'] = min_max.transform(df[['path_len']])  # right here!
    return df

The final value in df['path_len'] is 86 again!

How do I go about solving this?

Upvotes: 1

Views: 1021

Answers (2)

Santiago Echavarria

Reputation: 11

If you don't want to fit the scaler object again, you can do the following:

scaled_value = (value_in_original_scale - scaler.data_min_[i]) * scaler.scale_[i]

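where i is the index of the column that you want to transform. This formula holds for the default feature_range=(0, 1); for another range, add the lower bound of feature_range to the result.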
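As a quick check of the formula, here is a minimal sketch that compares the manual computation with scaler.transform(). The training values are hypothetical and chosen only for illustration, and the default feature_range=(0, 1) is assumed.

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical training column (one feature); values are illustrative only.
train = np.array([[1.0], [10.0], [16.0], [117.0], [269.0]])

scaler = MinMaxScaler()  # default feature_range=(0, 1)
scaler.fit(train)

i = 0  # index of the column to transform
value_in_original_scale = 86.0

# Manual scaling from the fitted attributes, as in the formula above.
manual = (value_in_original_scale - scaler.data_min_[i]) * scaler.scale_[i]

# The same value through the fitted scaler; transform expects a 2-D array.
via_transform = scaler.transform(np.array([[value_in_original_scale]]))[0, i]

print(manual, via_transform)
```

Both numbers should agree up to floating-point rounding, since data_min_ and scale_ are the attributes the fitted scaler itself relies on.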

Upvotes: 0

Sergey Bushmanov

Reputation: 25199

Change your line:

features['path_len'] = min_max.fit_transform(features[['path_len']])

to:

min_max.fit(features[['path_len']])
features['path_len'] = min_max.transform(features[['path_len']])

and your code will work in full as expected.
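For reference, here is a rough sketch of the whole flow: fit once, then reuse the same fitted scaler for a single new value. It assumes the new input is placed in a one-row DataFrame with the same column name. The three training rows come from the question; the simplified preprocess_input signature is hypothetical.

```python
import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# The three rows shown in the question; only path_len is scaled.
features = pd.DataFrame({
    "feature_1": [0, 1, 0],
    "feature_2": [1, 1, 0],
    "path_len": [10, 16, 117],
})

min_max = MinMaxScaler()
min_max.fit(features[["path_len"]])
# ravel() flattens the (n, 1) result into a 1-D column.
features["path_len"] = min_max.transform(features[["path_len"]]).ravel()

def preprocess_input(path_len, min_max):
    # Hypothetical simplified signature: build a one-row frame so the
    # column actually holds a value for the fitted scaler to transform.
    df = pd.DataFrame({"path_len": [path_len]})
    df["path_len"] = min_max.transform(df[["path_len"]]).ravel()
    return df

new_row = preprocess_input(86, min_max)
print(new_row)
```

Because only these three rows are used for fitting, the scaled value for 86 here differs from what the full dataset in the question would produce.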

Upvotes: 1
