Anurag Gupta

Reputation: 21

Can anyone explain the difference between sklearn.preprocessing.normalize and MinMaxScaler()?

I was following the sklearn documentation and was able to figure out MinMaxScaler(), but what does sklearn.preprocessing.normalize do? Can anyone explain it with a simple example? Thanks in advance.

Upvotes: 0

Views: 417

Answers (1)

LaSul

Reputation: 2411

  • The Normalizer processes each row and rescales it to unit norm (L2 by default), i.e.:

    The sum of the squared values in each row will be equal to 1.

So, for example:

from sklearn.preprocessing import Normalizer

X = [[4, 1, 2, 2]]  # Normalizer expects a 2D array: one row = one sample

transformer = Normalizer().fit(X)
# Returns
Normalizer(copy=True, norm='l2')

# Then when you transform:
transformer.transform(X)
# Returns
array([[0.8, 0.2, 0.4, 0.4]])

To verify this, you can check that the sum of squares is equal to one:

0.8^2 + 0.2^2 + 0.4^2 + 0.4^2 = 1
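Since the question is specifically about sklearn.preprocessing.normalize (the function form), here is a minimal sketch, assuming scikit-learn is installed, showing that it does the same row-wise scaling as the Normalizer class:

import numpy as np
from sklearn.preprocessing import normalize, Normalizer

# normalize() expects a 2D array; a single row is one sample
X = np.array([[4, 1, 2, 2]])

# Function form: rescales each row to unit L2 norm
print(normalize(X, norm='l2'))                  # [[0.8 0.2 0.4 0.4]]

# Class form: same result, but can be used inside a Pipeline
print(Normalizer(norm='l2').fit_transform(X))   # [[0.8 0.2 0.4 0.4]]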

  • The MinMaxScaler uses the min and max of each column to scale the data between 0 and 1 (by default), with the following formula:

    X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

    X_scaled = X_std * (max - min) + min

where min, max = feature_range

Taking the same values, but treated as a single column of four samples (since MinMaxScaler works per column):

# feature_range = (0, 1) if you want to scale between 0 and 1

X_std = [1, 0, 0.333, 0.333]
X_scaled = X_std * (1 - 0) + 0
# So X_scaled = X_std for this range

So the MinMax-scaled result is X_scaled = [1, 0, 0.333, 0.333].
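Here is a quick sketch to confirm the hand computation above; note that I reshape the row into a single column of four samples, since MinMaxScaler scales each column independently (that reshape is my assumption, not part of the original example):

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# One feature, four samples: [4, 1, 2, 2] as a column vector
X = np.array([4, 1, 2, 2]).reshape(-1, 1)

print(MinMaxScaler(feature_range=(0, 1)).fit_transform(X).ravel())
# [1.         0.         0.33333333 0.33333333]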

Taking another example, you can check the maths:

from sklearn.preprocessing import MinMaxScaler

data = [[-1, 2], [-0.5, 6], [0, 10], [1, 18]]
scaler = MinMaxScaler()
print(scaler.fit(data))
# Returns
MinMaxScaler(copy=True, feature_range=(0, 1))
print(scaler.data_max_)
[ 1. 18.]
print(scaler.transform(data))
[[0.   0.  ]
 [0.25 0.25]
 [0.5  0.5 ]
 [1.   1.  ]]
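To sum up the difference in one sketch (this side-by-side comparison is mine, not part of the original answer): the Normalizer works row by row, while the MinMaxScaler works column by column on the same data:

import numpy as np
from sklearn.preprocessing import Normalizer, MinMaxScaler

data = np.array([[-1, 2], [-0.5, 6], [0, 10], [1, 18]])

# Normalizer: each ROW is rescaled to unit L2 norm
print(Normalizer().fit_transform(data))

# MinMaxScaler: each COLUMN is rescaled to the [0, 1] range
print(MinMaxScaler().fit_transform(data))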

Upvotes: 2
