Reputation: 35
I am working on an imbalanced dataset with a two-class response variable using mlr3. I want to apply the SMOTE method to oversample the minority class. I learned that this method should be used only on the training set, not on the test set. However, if I understand correctly, an mlr3 pipeline manipulates the whole dataset before the task is set up, and only then is the dataset split into training and test sets. How can I apply the SMOTE method (mlr_pipeops_smote) only on the training set?
Upvotes: 0
Views: 229
Reputation: 109232
SMOTE is automatically applied only during training, so the test set is never oversampled; see the documentation of mlr_pipeops_smote:
"The output during prediction is the unchanged input."
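To see this in practice, here is a minimal sketch. It assumes the smotefamily package is installed (mlr_pipeops_smote depends on it) and uses simulated data from smotefamily::sample_generator plus a classif.rpart learner purely for illustration; the task id, the parameter values and the object names are my own choices, not something from the question. Wrapping the pipeline as a learner means that resample() trains it, SMOTE included, on the training rows of each fold and predicts on the untouched test rows.

```r
library(mlr3)
library(mlr3pipelines)

set.seed(1)

# Illustrative imbalanced data with numeric features (SMOTE needs numeric features)
data = smotefamily::sample_generator(1000, ratio = 0.80)
data$result = factor(data$result)
task = as_task_classif(data, target = "result", id = "example")

# Standalone check: training adds synthetic minority rows, prediction does not
po_smote = po("smote", K = 5, dup_size = 1)
train_out = po_smote$train(list(task))[[1]]
pred_out = po_smote$predict(list(task))[[1]]
table(task$truth())       # original class counts
table(train_out$truth())  # minority class oversampled
table(pred_out$truth())   # same as the original counts

# In a real workflow, put SMOTE in front of the learner and resample the
# combined learner; each fold's test set then passes through unchanged
graph_learner = as_learner(po("smote", K = 5, dup_size = 1) %>>% lrn("classif.rpart"))
rr = resample(task, graph_learner, rsmp("cv", folds = 3))
rr$aggregate(msr("classif.bacc"))
```

The key point is to resample the combined learner rather than running the SMOTE step on the task yourself beforehand; calling the pipeline's train step on the full task before splitting would oversample the rows that later end up in the test set.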
Upvotes: 2