Yiwei Zhu

Reputation: 35

How to apply pipeline_smote just on training set in mlr3pipelines?

I am working with mlr3 on an imbalanced dataset with a two-class response variable, and I want to apply SMOTE to oversample the minority class. I learned that this method should be used only on the training set, not on the test set. However, if I understand correctly, an mlr3 pipeline manipulates the whole dataset before the task is set up, and only then is the dataset split into training and test sets. How can I apply SMOTE (mlr_pipeops_smote) only to the training set?

Upvotes: 0

Views: 229

Answers (1)

Lars Kotthoff

Reputation: 109232

SMOTE is automatically applied only during training, so the test set is never oversampled; see the documentation for the SMOTE PipeOp:

The output during prediction is the unchanged input.
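
A minimal sketch of how this could be checked, not a definitive recipe. It assumes the smotefamily package is installed (po("smote") depends on it). The built-in sonar task, the K and dup_size values, the rpart learner, and the 3-fold CV are illustrative choices only, and sonar is only mildly imbalanced:

```r
library(mlr3)
library(mlr3pipelines)  # po("smote") also needs the smotefamily package installed

# Built-in binary task with numeric features only (SMOTE needs numeric features).
task = tsk("sonar")

# Illustrative parameter values, not recommendations.
smote = po("smote", K = 5, dup_size = 1)

# Training phase: synthetic minority-class rows are added to the task.
trained = smote$train(list(task))[[1]]

# Prediction phase: the task is passed through unchanged.
predicted = smote$predict(list(task))[[1]]

c(original = task$nrow, after_train = trained$nrow, after_predict = predicted$nrow)

# Wrapped in a GraphLearner, the same logic holds inside resampling:
# each fold oversamples its training rows only and predicts on untouched test rows.
graph = po("smote", K = 5, dup_size = 1) %>>% lrn("classif.rpart")
glrn = as_learner(graph)
rr = resample(task, glrn, rsmp("cv", folds = 3))
rr$aggregate(msr("classif.bacc"))
```

The row counts should show more rows after training and the original count after prediction. Because the PipeOp is part of the learner rather than a preprocessing step applied to the data up front, the train/test split happens before SMOTE ever sees the data.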

Upvotes: 2
