Reputation: 3299
Given a 2D array, I would like to normalize it into the range 0-1.
I know this can be achieved as below:
import numpy as np
from sklearn.preprocessing import normalize,MinMaxScaler
np.random.seed(0)
t_feat=4
t_epoch=3
t_wind=2
result = [np.random.rand(t_epoch, t_feat) for _ in range(t_wind)]
wdw_epoch_feat=np.array(result)
matrix=wdw_epoch_feat[:,:,0]
xmax, xmin = matrix.max(), matrix.min()
x_norm = (matrix - xmin)/(xmax - xmin)
which produces
[[0.55153917 0.42094786 0.98439526], [0.57160496 0. 1. ]]
However, I cannot get the same result using sklearn's MinMaxScaler:
scaler = MinMaxScaler()
x_norm = scaler.fit_transform(matrix)
which produces
[[0. 1. 0.], [1. 0. 1.]]
I would appreciate any thoughts.
Upvotes: 0
Views: 1937
Reputation: 19312
A clever way to do this is to reshape your data into a single column, apply the transform, and reshape the result back to the original shape:
import numpy as np
from sklearn.preprocessing import MinMaxScaler
X = np.array([[-1, 2], [-0.5, 6]])
scaler = MinMaxScaler()
X_one_column = X.reshape([-1,1])
result_one_column = scaler.fit_transform(X_one_column)
result = result_one_column.reshape(X.shape)
print(result)
[[ 0. 0.42857143]
[ 0.07142857 1. ]]
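If you do not need a fitted scaler object, the same global scaling can be sketched in plain NumPy. The helper name minmax_global below is hypothetical (it is not part of NumPy or sklearn), and returning zeros for a constant array is an assumption made here to avoid dividing by zero:

```python
import numpy as np

def minmax_global(a):
    """Scale every entry of `a` into [0, 1] using the array-wide min and max.

    Hypothetical helper for illustration; not part of NumPy or sklearn.
    """
    a = np.asarray(a, dtype=float)
    lo, hi = a.min(), a.max()
    span = hi - lo
    if span == 0:
        # Assumption: a constant array maps to all zeros instead of raising.
        return np.zeros_like(a)
    return (a - lo) / span

X = np.array([[-1, 2], [-0.5, 6]])
result = minmax_global(X)
print(result)
```

For the X above this should give the same values as the reshape approach, since both use the overall minimum (-1) and maximum (6).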
Upvotes: 1
Reputation: 1336
Your manual code normalizes over the entire matrix at once. MinMaxScaler is designed for machine learning, so it scales each feature (column) independently, using that column's own minimum and maximum. To get the same results as your manual code, you need to put all the values into a single column. I show this below and get your same results in the first column (the second column is just random data):
import numpy as np
from sklearn.preprocessing import normalize, MinMaxScaler
np.random.seed(0)
t_feat=4
t_epoch=3
t_wind=2
result = [np.random.rand(t_epoch, t_feat) for _ in range(t_wind)]
wdw_epoch_feat=np.array(result)
matrix=wdw_epoch_feat[:,:,0]
xmax, xmin = matrix.max(), matrix.min()
x_norm = (matrix - xmin)/(xmax - xmin)
matrix = np.array([matrix.flatten(), np.random.rand(len(matrix.flatten()))]).T
scaler = MinMaxScaler()
test = scaler.fit_transform(matrix)
print(test)
-------------------------------------------
[[0.55153917 0. ]
[0.42094786 0.63123194]
[0.98439526 0.03034732]
[0.57160496 1. ]
[0. 0.48835502]
[1. 0.35865137]]
When you use MinMaxScaler for machine learning, you generally want to scale each column (feature) separately, which is what it does by default.
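To see why the question's 2x3 matrix turns into only 0s and 1s, here is a minimal NumPy sketch that compares the two approaches. It assumes the default feature_range of (0, 1) and mimics the per-column behaviour with axis=0 rather than calling sklearn; the names per_column and global_norm are made up for this example:

```python
import numpy as np

# Rebuild the matrix from the question.
np.random.seed(0)
t_feat, t_epoch, t_wind = 4, 3, 2
result = [np.random.rand(t_epoch, t_feat) for _ in range(t_wind)]
matrix = np.array(result)[:, :, 0]  # shape (2, 3)

# Per-column scaling: each column uses its own min and max.
col_min = matrix.min(axis=0)
col_max = matrix.max(axis=0)
per_column = (matrix - col_min) / (col_max - col_min)

# Global scaling: one min and one max for the whole matrix.
global_norm = (matrix - matrix.min()) / (matrix.max() - matrix.min())

print(per_column)
print(global_norm)
```

With only two rows, every column holds exactly one minimum and one maximum, so per-column scaling can only ever produce 0 and 1. The global version keeps the relative sizes across the whole matrix.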
Upvotes: 1