Reputation: 2189
One of my datasets is imbalanced: the minority class has very few samples compared to the majority class, so I want to balance the data by undersampling the majority class. When I try to use RandomUnderSampler from the imblearn package on a 3D array, I get this error:
ValueError: Found array with dim 3. Estimator expected <= 2.
The features are stored in a 3D array:
train['X'].shape
(276216, 101, 4)
The labels:
train['y'].shape
(276216, 1)
This is the code I run to randomly undersample the data:
from imblearn.under_sampling import RandomUnderSampler
undersample = RandomUnderSampler(sampling_strategy='majority')
X_train_under, y_train_under = undersample.fit(train['X'], train['y'])
I get the above error. Any help would be appreciated.
Upvotes: 2
Views: 301
Reputation: 2291
RandomUnderSampler expects 2D arrays, so reshape your data so that each sample is a single flat row and you'll be fine. Also, as per the docs, you have to call fit_resample instead of fit, because fit does not return the resampled arrays.
X = train['X'].reshape(train['X'].shape[0], -1)
X_train_under, y_train_under = undersample.fit_resample(X, train['y'])
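Note that after the reshape above, X_train_under is 2D, so you will probably want to reshape it back, e.g. X_train_under.reshape(-1, 101, 4). Below is a rough sketch of that round trip. To keep it self-contained it does not use imblearn at all: undersample_majority is a hypothetical NumPy helper I made up that mimics what sampling_strategy='majority' does (shrink the largest class to the size of the smallest), and the array sizes are small stand-ins for the real data. Treat it as an illustration, not as how imblearn works internally.

```python
import numpy as np

def undersample_majority(X, y, seed=0):
    """Hypothetical helper (not part of imblearn): randomly drop rows of the
    largest class until it has as many samples as the smallest class."""
    y = np.asarray(y).ravel()                      # (n, 1) labels -> (n,)
    classes, counts = np.unique(y, return_counts=True)
    majority = classes[np.argmax(counts)]
    target = counts.min()
    rng = np.random.default_rng(seed)
    majority_idx = np.flatnonzero(y == majority)
    kept_majority = rng.choice(majority_idx, size=target, replace=False)
    # Keep every non-majority row plus the sampled majority rows, in order.
    keep = np.sort(np.concatenate([np.flatnonzero(y != majority), kept_majority]))
    return X[keep], y[keep]

# Small stand-in for train['X'] / train['y']: 900 majority, 100 minority.
rng = np.random.default_rng(42)
X = rng.normal(size=(1000, 101, 4))
y = np.zeros((1000, 1), dtype=int)
y[:100] = 1

# Row indexing works on any number of dimensions, so 3D can be sampled directly.
X_under, y_under = undersample_majority(X, y)

# Same idea as the answer: flatten to 2D, resample, then restore the 3D shape.
X_flat = X.reshape(X.shape[0], -1)                 # (1000, 404)
X_flat_under, _ = undersample_majority(X_flat, y)
X_restored = X_flat_under.reshape(-1, *X.shape[1:])
```

Since reshape only changes how each sample's values are laid out and never mixes rows, flattening before the sampler and restoring the shape afterwards loses nothing.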
Upvotes: 3