Euler_Salter

Reputation: 3561

MinMaxScaler sklearn: should I normalize class labels too?

I am using MLPRegressor with 5 continuous features and 1 categorical feature that takes values from a set of 40 integers [0, 1, 2, ..., 39].

I was told that normalizing the features using sklearn.preprocessing.MinMaxScaler(feature_range=(0,1)) can help with performance, both with MLPs and LSTMs.

Thus I am using it on my Xtrain matrix containing the features above.
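
Concretely, that looks something like this (a sketch; the random Xtrain is a hypothetical stand-in for my real data):

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Hypothetical stand-in for Xtrain: 5 continuous columns + 1 categorical column in {0..39}
Xtrain = np.hstack([np.random.randn(100, 5),
                    np.random.randint(0, 40, size=(100, 1))])

scaler = MinMaxScaler(feature_range=(0, 1))
# every column, including the categorical one, gets scaled into [0, 1]
Xtrain_scaled = scaler.fit_transform(Xtrain)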

However, it seems odd to me that I should also be normalizing a categorical variable. Should I do it? The documentation (http://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.MinMaxScaler.html) says that MinMaxScaler normalizes each feature separately. Should I remove the categorical column and normalize only the others?

Also, if it normalizes each feature separately, how does it know how to transform them back when I use inverse_transform?

Upvotes: 3

Views: 7270

Answers (3)

vipin bansal

Reputation: 896

A categorical feature should be represented with one-hot encoding. Still, if you normalize a categorical feature first, it will not harm your data: it just maps the values to another range while keeping them discrete. Please find a small code example below:

import numpy as np
from sklearn.preprocessing import OneHotEncoder, MinMaxScaler

# A single "categorical" feature with 5 distinct values
data = np.array([-2, -2, -78, -78, -1, -1, 0, 0, 1, 1])

# Scale the values into [0, 1]; distinct values stay distinct
scaler = MinMaxScaler(feature_range=(0, 1))
normalizedData = scaler.fit_transform(data.reshape(-1, 1))

# One-hot encode the scaled values
encoder = OneHotEncoder(categories='auto', sparse=False)
encodedData = encoder.fit_transform(normalizedData)
print(encodedData)

Output after one-hot encoding:

[[0. 1. 0. 0. 0.]
 [0. 1. 0. 0. 0.]
 [1. 0. 0. 0. 0.]
 [1. 0. 0. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 1. 0. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 1. 0.]
 [0. 0. 0. 0. 1.]
 [0. 0. 0. 0. 1.]]

And the output remains the same even if I feed the data directly to the encoder, i.e. without normalizing first.
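
A quick check of that claim, reusing the variables from the snippet above:

# Encode the raw values directly, skipping the scaler
directEncoded = encoder.fit_transform(data.reshape(-1, 1))
print((directEncoded == encodedData).all())  # True: min-max scaling is monotonic, so the categories line up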

Upvotes: 3

yoav_aaa

Reputation: 387

Scaling categorical variables is unnecessary, since there is no natural metric on the space of such variables.

As for your second question: after being fitted to data, the MinMaxScaler object keeps scale_, data_range_, data_min_ and data_max_ (arrays whose length equals the number of normalized features).

These attributes enable the inverse transformation for each feature.
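
A minimal sketch (with made-up data) showing these attributes and the per-feature inverse transform:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# Two features on very different scales
X = np.array([[1.0, 100.0],
              [2.0, 300.0],
              [3.0, 500.0]])

scaler = MinMaxScaler(feature_range=(0, 1))
X_scaled = scaler.fit_transform(X)

print(scaler.data_min_)    # [  1. 100.]  per-feature minima
print(scaler.data_max_)    # [  3. 500.]  per-feature maxima
print(scaler.data_range_)  # [  2. 400.]  per-feature ranges
print(scaler.scale_)       # per-feature scaling factors

# inverse_transform undoes the scaling column by column
X_back = scaler.inverse_transform(X_scaled)
print(np.allclose(X_back, X))  # True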

Upvotes: 0

E P

Reputation: 146

Categorical variables should be handled accordingly, i.e. with one-hot encoding.

After that, the MinMax scaler would not really change the encoded features, since they already lie in [0, 1].

Answering your last question: the scaler simply stores the minimum and maximum for each input feature separately, so it can perform the inverse transform. And it makes sense to scale features independently; they may differ in scale and even in nature.
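
If you want to scale only the continuous columns and one-hot encode the categorical one, a sketch using sklearn's ColumnTransformer could look like this (the column indices and the random stand-in for Xtrain are assumptions):

import numpy as np
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import MinMaxScaler, OneHotEncoder

rng = np.random.RandomState(0)
# Hypothetical Xtrain: 5 continuous columns + 1 categorical column in {0..39}
Xtrain = np.hstack([rng.randn(100, 5),
                    rng.randint(0, 40, size=(100, 1))])

preprocess = ColumnTransformer([
    ("scale",  MinMaxScaler(feature_range=(0, 1)), [0, 1, 2, 3, 4]),  # continuous columns
    ("onehot", OneHotEncoder(categories='auto'), [5]),                # categorical column
])

Xtrain_t = preprocess.fit_transform(Xtrain)
print(Xtrain_t.shape)  # (100, 5 + number of categories actually seen)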

Upvotes: 2
