Leo

Reputation: 173

How to normalize a specific dimension of a 3D array

sklearn.preprocessing.normalize only supports normalization of 2D arrays. However, I currently have a 3D array for LSTM model training, shaped (batch, step, features), and I wish to normalize the features.

I have tried tf.keras.utils.normalize(X_train, axis=-1, order=2), but it does not give the result I want.
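For reference, a minimal sketch (with made-up data of the shape described below) of what that call computes: it divides each length-68 feature vector by its L2 norm, so each (sample, step) slice ends up with unit norm rather than each feature being mapped onto a fixed range.

import numpy as np
import tensorflow as tf

# Made-up data with the same layout as in the question: (batch, step, features)
X = np.random.uniform(-20, 100, (4, 100, 68))

# Divides every vector along the last axis by its L2 norm
X_norm = tf.keras.utils.normalize(X, axis=-1, order=2)

print(np.linalg.norm(X_norm[0, 0]))   # ~1.0: each feature vector has unit norm
print(X_norm.min(), X_norm.max())     # values are small, not spread over [-1, 1]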

Another way is to reshape the 3D array into a 2D array:

print(X_train.shape)
print(max(X_train[0][0]))

output

(1883, 100, 68)
6.028588763956215

from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
# Flatten (batch, step, features) -> (batch, step*features), fit/transform in 2D, then restore the 3D shape
X_train = scaler.fit_transform(X_train.reshape(X_train.shape[0], -1)).reshape(X_train.shape)
X_test = scaler.transform(X_test.reshape(X_test.shape[0], -1)).reshape(X_test.shape)
print(X_train.shape)
print(max(X_train[0][0]))
print(min(X_train[0][0]))

output

(1883, 100, 68)
3.2232538993444533
-1.9056918449890343

The values are still not within -1 and 1.

How should I approach it?

Upvotes: 3

Views: 4087

Answers (1)

Marco Cerliani

Reputation: 22031

As suggested in the comments, I'm posting this as an answer.

You can scale a 3D array with sklearn preprocessing methods. You simply have to reshape the data to 2D to fit the scaler and then reshape it back to 3D. This can be done with a few lines of code.

If you want the scaled data to be in the range (-1, 1), you can simply use MinMaxScaler with feature_range=(-1,1). Unlike StandardScaler, which standardizes to zero mean and unit variance and does not bound the values, MinMaxScaler maps them exactly onto the requested range:

import numpy as np
from sklearn.preprocessing import MinMaxScaler

X_train = np.random.uniform(-20,100, (1883, 100, 68))
X_test = np.random.uniform(-20,100, (100, 100, 68))

print(X_train.shape)
print(X_train.min().round(5), X_train.max().round(5)) # -20, 100

scaler = MinMaxScaler(feature_range=(-1,1))
# Flatten (batch, step, features) -> (batch, step*features), fit/transform in 2D, then restore the 3D shape
X_train = scaler.fit_transform(X_train.reshape(X_train.shape[0], -1)).reshape(X_train.shape)
X_test = scaler.transform(X_test.reshape(X_test.shape[0], -1)).reshape(X_test.shape)

print(X_train.shape)
print(X_train.min().round(5), X_train.max().round(5)) # -1, 1
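
As a side note, if the goal is specifically to scale each of the 68 features using statistics pooled over all samples and timesteps (rather than learning one min/max per flattened (step, feature) column, as above), a minimal variant of the same reshape trick is to flatten the batch and step axes instead; the data shapes below are assumed to match the snippet above.

import numpy as np
from sklearn.preprocessing import MinMaxScaler

X_train = np.random.uniform(-20, 100, (1883, 100, 68))
X_test = np.random.uniform(-20, 100, (100, 100, 68))

# Flatten (batch, step, features) -> (batch*step, features) so one min/max
# pair is learned per feature, then restore the original 3D shape
scaler = MinMaxScaler(feature_range=(-1, 1))
X_train = scaler.fit_transform(X_train.reshape(-1, X_train.shape[-1])).reshape(X_train.shape)
X_test = scaler.transform(X_test.reshape(-1, X_test.shape[-1])).reshape(X_test.shape)

print(X_train.shape)                                   # (1883, 100, 68)
print(X_train.min().round(5), X_train.max().round(5))  # -1.0, 1.0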

Upvotes: 6
