user24804654

Reputation: 1

MLR3 Pipeline: Access Data of PipeOP SMOTE in a Resample-/Benchmark-Structure

I would like to inspect the data generated by the SMOTE PipeOp. This works when I train a Graph directly or convert the Graph to a GraphLearner. However, when I run a resampling (or a benchmark), I can apparently only see the data of the respective resampling fold, without the additional synthetic rows. Maybe I am misunderstanding something here, but I would be very grateful for a hint. A minimal reproducible example is below. Thank you!

# minimal reproducible example

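# Load the packages used below (classif.ranger is provided by mlr3learners)
library(mlr3)
library(mlr3pipelines)
library(mlr3learners)

# as Graph
# Create example task
data = smotefamily::sample_generator(1000, ratio = 0.80)
data$result = factor(data$result)
task = TaskClassif$new(id = "example", backend = data, target = "result")
task$data()
table(task$data()$result)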

# Create graph & Train:
gr = Graph$new()
gr$add_pipeop(po("smote", dup_size = 140))
gr$add_pipeop(mlr_pipeops$get("learner", lrn("classif.ranger", predict_type = "prob")))
gr$add_edge("smote", "classif.ranger")
print(gr)
gr$plot()
gr$keep_results = TRUE # keep intermediate results so that .result is populated
gr$train(task)

# Access result of pipeop "smote" of this graph:
gr$pipeops$smote$.result$output$data() # -> is working, additional rows are shown

# Now: Convert graph to learner and access data of the smote pipeop
lr = as_learner(gr)
lr$train(task)
lr$graph_model$pipeops$smote$.result$output$data() # -> is working too, additional rows are shown

# Now: Create a resampling, run it, and try to access the data of the smote pipeop:
r = rsmp("cv", folds = 2L)
rr = resample(task, lr, r, store_models = TRUE, store_backends = TRUE)
rr$learners[[1]]$graph_model$pipeops$smote$.result$output$data() # -> does not work as expected: only the rows of the respective resampling fold are shown...?

Upvotes: 0

Views: 39

Answers (0)
