Joao Souza

Reputation: 3

Is there a way to stack models trained with different data sets with the stacks package in R?

Briefly, I am working with data sets from two different countries. My aim is to ensemble the models for both countries to see how well the ensemble generalizes.

My set-up is: I have trained one workflow_set for each country (10 model specifications with resampling and a grid search of size 20).

This is the error I get when trying to add them as candidates:

predictions <- stacks() %>% 
  add_candidates(wf_set_1) %>% 
  add_candidates(wf_set_2)

Error: It seems like the new candidate member 'Logistic Regression' doesn't make use of the same resampling object as the existing candidates.

Upvotes: 0

Views: 366

Answers (1)

Simon Couch

Reputation: 531

Thanks for the question!

Unfortunately, we don't support ensembling models trained on different data sets in stacks. There are a few operations that are no longer well-defined when this is the case.

Given your description of the problem, though, this sounds like a setting where, rather than fitting a separate model for each country, you would include country as a feature in a single model fit across countries. For any covariate x_i whose effect you think may depend on country, you can add an interaction term with step_interact(~ x_i:country). If country is a factor, dummy-encode it first with step_dummy().
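To make that concrete, here is a minimal sketch of such a recipe. The data frame `dat`, the predictor `x1`, and the outcome `y` are made up for illustration and are not from the question; the only assumption about your data is that country is stored as a factor column.

```r
library(recipes)

# Hypothetical toy data: two countries, one numeric predictor, numeric outcome
set.seed(1)
dat <- data.frame(
  country = factor(rep(c("A", "B"), each = 50)),
  x1      = rnorm(100),
  y       = rnorm(100)
)

rec <- recipe(y ~ x1 + country, data = dat) %>%
  # Convert the country factor into indicator column(s) first
  step_dummy(country) %>%
  # Interact x1 with every country indicator created above
  step_interact(terms = ~ x1:starts_with("country"))

# Estimate the recipe and return the processed training data
baked <- bake(prep(rec), new_data = NULL)
names(baked)
```

The recipe can then go into a single workflow (or workflow_set) that is resampled once on the combined data, so every candidate passed to add_candidates() shares the same resampling object.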

Upvotes: 2
