Reputation: 598
I want a "balanced batch sampler" for my machine-learning training without explicitly creating and storing a balanced batch (to save memory).
Initially, I had planned to use
imblearn.tensorflow.balanced_batch_generator
where you can get training samples like this:
from imblearn.over_sampling import RandomOverSampler
from imblearn.tensorflow import balanced_batch_generator

gen, steps = balanced_batch_generator(xtrain, ytrain, sampler=RandomOverSampler(), batch_size=100)
xsample, ysample = next(gen)
But the drawback here is that the data needs to have a well-defined number of features, whereas my data is of a custom datatype for which the number of features cannot be defined. Are there any other libraries I can use for this? Or can I somehow trick the above library into working anyway?
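One way around the fixed-feature requirement is to do the balancing at the index level yourself: sample an equal number of *indices* per class each batch (with replacement, so minority classes are effectively oversampled), and then look those indices up in whatever container holds your custom-typed samples. Below is a minimal sketch of that idea; `balanced_index_generator` is a hypothetical helper written here for illustration, not part of imblearn:

```python
import numpy as np

def balanced_index_generator(labels, batch_size, rng=None):
    """Yield batches of indices with equal counts per class.

    Minority classes are oversampled with replacement. Because only
    indices are produced, the samples themselves can be any custom
    datatype -- no fixed number of features is required.
    """
    rng = rng or np.random.default_rng()
    labels = np.asarray(labels)
    classes = np.unique(labels)
    # precompute the index pool for each class
    per_class = [np.flatnonzero(labels == c) for c in classes]
    n_per_class = batch_size // len(classes)
    while True:
        batch = np.concatenate([
            rng.choice(pool, size=n_per_class, replace=True)
            for pool in per_class
        ])
        rng.shuffle(batch)  # avoid class-ordered batches
        yield batch

# toy usage with an imbalanced label array (8 of class 0, 2 of class 1)
ylabels = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
gen = balanced_index_generator(ylabels, batch_size=6)
idx = next(gen)  # 3 indices of class 0 and 3 of class 1, shuffled
```

Inside your training loop you would then build each batch as `[my_custom_samples[i] for i in idx]`, so the generator never needs to know what a sample looks like.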
Upvotes: 0
Views: 34