Apply mlr3 pipes on group by basis

Question

I would like to know whether it is possible to apply mlr3pipelines preprocessing on a group-by basis.

For example, following the mlr3pipelines documentation, we can scale predictors with the following code:

library(mlr3)
library(mlr3pipelines)
task = tsk("iris")
pop = po("scalemaxabs")
pop$train(list(task))[[1]]$data()

But is it possible to scale by group? For example, let's add a month column to the iris data:

library(mlr3)
library(mlr3pipelines)
task = tsk("iris")
dt = task$data()
dt[, month := c(rep(1, 50), rep(2, 50), rep(3, 50))]
task = as_task_classif(dt, target = "Species", id = "iris")

Is it possible to scale the predictors by the month column? That is, we want to scale the rows of each month separately. Using data.table, this is easy:

task$data()[, lapply(.SD, function(x) as.vector(scale(x))), .SDcols = names(dt)[2:5], by = month]

but is it possible to do this inside an mlr3pipelines graph?
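To make the intended result concrete, here is a minimal, self-contained sketch of the group-wise scaling using data.table only, without mlr3. The names `feature_cols` and `scaled` are introduced here for illustration. Note that `scale()` centers each column and divides by its standard deviation, which is not the same transformation as `po("scalemaxabs")`; the sketch only shows what "scale every month separately" means.

```r
library(data.table)

# iris with a month column: 50 rows per month, as in the question
dt = as.data.table(iris)
dt[, month := rep(1:3, each = 50)]

# Numeric predictors only (excludes the target and the grouping column)
feature_cols = setdiff(names(dt), c("Species", "month"))

# Standardize each predictor within its month
scaled = dt[, lapply(.SD, function(x) as.vector(scale(x))),
  .SDcols = feature_cols, by = month]

# Sanity check: per-month means and standard deviations of each predictor
means = scaled[, lapply(.SD, mean), .SDcols = feature_cols, by = month]
sds = scaled[, lapply(.SD, sd), .SDcols = feature_cols, by = month]
print(means)
print(sds)
```

Within each month, every predictor should come out with mean approximately 0 and standard deviation 1. This is the behaviour I would like to reproduce as a step in a graph.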
