Shuai Zhang

Reputation: 2061

How to add oversampling/undersampling procedure in scikit's Pipeline?

I would like to add an oversampling procedure, such as SMOTE, to scikit-learn's Pipeline. But transformers only support the fit and transform methods, and do not provide a way to increase the number of samples and targets.

One possible way to do this is to break the pipeline into two separate pipelines connected by the SMOTE sampling step.
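
A minimal sketch of this workaround, under some assumptions: `random_oversample` is a hypothetical helper that simply duplicates minority rows and stands in for a real SMOTE implementation, and the toy data, step names, and choice of `StandardScaler` and `LogisticRegression` are illustrative only.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def random_oversample(X, y, random_state=0):
    """Hypothetical stand-in for SMOTE: duplicate minority rows at random
    until every class has as many samples as the largest class."""
    rng = np.random.default_rng(random_state)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    keep = []
    for cls, count in zip(classes, counts):
        idx = np.flatnonzero(y == cls)
        extra = rng.choice(idx, size=target - count, replace=True)
        keep.append(np.concatenate([idx, extra]))
    keep = np.concatenate(keep)
    return X[keep], y[keep]


# Toy imbalanced data (assumed): 90 samples of class 0, 10 of class 1.
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, (90, 2)), rng.normal(2.0, 1.0, (10, 2))])
y = np.array([0] * 90 + [1] * 10)

# Stage 1: ordinary transformers, fitted on the original training data.
pre = Pipeline([("scale", StandardScaler())])
X_pre = pre.fit_transform(X, y)

# The resampling step sits between the two pipelines and sees training data only.
X_res, y_res = random_oversample(X_pre, y)

# Stage 2: the estimator, fitted on the resampled data.
clf = Pipeline([("model", LogisticRegression())])
clf.fit(X_res, y_res)

# At prediction time the resampling step is skipped entirely.
pred = clf.predict(pre.transform(X))
```

The drawback is that the two halves can no longer be treated as one estimator, so helpers such as cross-validation or grid search would need the resampling repeated by hand inside each training fold.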

Are there any better solutions?

Upvotes: 6

Views: 4661

Answers (1)

ogrisel

Reputation: 40159

Our current Pipeline does not support changing the number of samples between steps, because the Transformer.transform method does not return the y argument, which would also need to be resampled. This is a known limitation of the current design. It might be fixed in a future version, but we have not started working on that yet.
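
To illustrate the limitation, here is a small sketch using `StandardScaler` as an example transformer (the toy arrays are assumed for illustration): `transform` receives X only and returns X only, so there is no channel through which a resampled y could reach the next step.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data (assumed): 6 samples, 2 features, imbalanced labels.
X = np.arange(12, dtype=float).reshape(6, 2)
y = np.array([0, 0, 0, 0, 1, 1])

scaler = StandardScaler().fit(X, y)  # fit accepts y, but this transformer ignores it
X_out = scaler.transform(X)          # transform takes X only and returns X only

# The result is a single array with one row per input row; no second
# return value exists that could carry a resampled y to the next step.
same_rows = X_out.shape[0] == X.shape[0]
```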

Upvotes: 5
