Consider the following Python 3 example, where we use MinMaxScaler from scikit-learn to normalize a range of numbers and then invert the transformation to recover the original values.
from pandas import Series
from sklearn.preprocessing import MinMaxScaler

data = [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
series = Series(data)
values = series.values
# reshape into a 2D column array (samples x features), as the scaler expects
values = values.reshape((len(values), 1))
scaler = MinMaxScaler(feature_range=(0, 1))
scaler = scaler.fit(values)
normalized = scaler.transform(values)
# map the normalized values back to the original scale
inversed = scaler.inverse_transform(normalized)
One would expect that inversed equals values. Alas:
>>> inversed == values
array([[ True],
[ True],
[False],
[ True],
[ True],
[False],
[ True],
[ True],
[ True],
[ True]])
>>> print(values)
[[ 10.]
[ 20.]
[ 30.]
[ 40.]
[ 50.]
[ 60.]
[ 70.]
[ 80.]
[ 90.]
[100.]]
>>> print(inversed)
[[ 10.]
[ 20.]
[ 30.]
[ 40.]
[ 50.]
[ 60.]
[ 70.]
[ 80.]
[ 90.]
[100.]]
What's happening here? Why are inversed[2] and inversed[5] unequal to values[2] and values[5]?
Upvotes: 2
Views: 2172
Reputation: 5437
You are comparing two floats for exact equality. I would assume this can yield unexpected results due to rounding differences. Could you provide the bit patterns of the floating-point values? You might also want to check the floating-point guide.
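As a minimal sketch (my own addition, not part of the original snippet; it assumes numpy and scikit-learn as in the question), the usual way to compare floating-point arrays is with a tolerance rather than exact equality:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

# rebuild the question's data as a column vector
values = np.array([10., 20., 30., 40., 50., 60., 70., 80., 90., 100.]).reshape(-1, 1)
scaler = MinMaxScaler(feature_range=(0, 1)).fit(values)
inversed = scaler.inverse_transform(scaler.transform(values))

# element-wise comparison within a small tolerance
print(np.isclose(values, inversed))   # all True, unlike values == inversed
# single boolean verdict for the whole array
print(np.allclose(values, inversed))  # True

np.allclose / np.isclose absorb differences on the order of a few ulps, which is exactly the kind of discrepancy you are seeing.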
As @CoMartel suggested, you can observe the difference between values and inversed (I haven't found a bit representation of floats in numpy so far):
>>> values - inversed
array([[ 0.00000000e+00],
[ 0.00000000e+00],
[-3.55271368e-15],
[ 0.00000000e+00],
[ 0.00000000e+00],
[-7.10542736e-15],
[ 0.00000000e+00],
[ 0.00000000e+00],
[ 0.00000000e+00],
[ 0.00000000e+00]])
The nonzero differences at indices 2 and 5 show why those entries cannot compare as exactly equal; they are precisely the positions where == returned False.
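If you do want to inspect the underlying bit patterns, one possible sketch (my own addition, not from the answer above) is to reinterpret the float64 buffer as 64-bit integers, or to look at Python's float.hex():

import numpy as np

a = np.array([60.0])                     # plays the role of values[5]
b = np.array([60.0 + 7.10542736e-15])    # plays the role of inversed[5], roughly one ulp above 60.0

# reinterpret the raw IEEE-754 representation as integers;
# equal floats would yield equal integers here
print(a.view(np.int64), b.view(np.int64))

# hexadecimal form of the individual Python floats
print(float(a[0]).hex())  # '0x1.e000000000000p+5'
print(float(b[0]).hex())  # differs only in the last hex digit of the mantissa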
Upvotes: 1