How to apply pipeline_smote just on training set in mlr3pipelines?

Question

I am working on an imbalanced dataset with a two-class response variable using mlr3. I want to apply the SMOTE method to oversample the minority class. I learned that this method should be used only on the training set, not on the test set. However, unless I misunderstand, the mlr3 pipeline manipulates the whole dataset before the task is set up and the data is split into training and test sets. How can I apply the SMOTE method (mlr_pipeops_smote) only on the training set?

Lars Kotthoff · Accepted Answer

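It is automatically applied only on the training set. When the PipeOp is part of a graph learner, resampling trains the whole graph on each training split, and the test data passes through the SMOTE step unchanged at prediction time; see the documentation: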

The output during prediction is the unchanged input.
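
A minimal sketch of how this can be checked, assuming the mlr3, mlr3pipelines, smotefamily and rpart packages are installed. The toy data, its column names (x1, x2, y) and the parameter values K = 5 and dup_size = 2 are made up for illustration and are not from the question:

```r
library(mlr3)
library(mlr3pipelines)  # po("smote") additionally requires the smotefamily package

set.seed(1)

# Hypothetical imbalanced toy data: 200 "neg" rows, 20 "pos" rows.
# SMOTE in mlr3pipelines works on numeric features only.
n_major = 200
n_minor = 20
data = data.frame(
  x1 = c(rnorm(n_major), rnorm(n_minor, mean = 2)),
  x2 = c(rnorm(n_major), rnorm(n_minor, mean = 2)),
  y  = factor(c(rep("neg", n_major), rep("pos", n_minor)))
)
task = as_task_classif(data, target = "y", positive = "pos")

# Use the PipeOp on its own to compare training and prediction behaviour.
po_smote = po("smote", K = 5, dup_size = 2)
train_out = po_smote$train(list(task))[[1]]    # synthetic minority rows are added
pred_out  = po_smote$predict(list(task))[[1]]  # input task is returned unchanged

print(table(task$truth()))       # original class counts
print(table(train_out$truth()))  # minority class is oversampled
print(table(pred_out$truth()))   # same counts as the original task

# In practice: wrap SMOTE and a learner in a graph learner and resample it.
# Each fold trains the graph (including SMOTE) on the training split only;
# the test split goes through the prediction path and is left untouched.
graph = po("smote", K = 5, dup_size = 2) %>>% lrn("classif.rpart")
glrn = as_learner(graph)
rr = resample(task, glrn, rsmp("cv", folds = 3))
print(rr$aggregate(msr("classif.bacc")))
```

The first part only demonstrates the train/predict asymmetry of the PipeOp; the second part is the pattern to use for actual model evaluation, since the split into training and test sets then happens before SMOTE is applied.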
