I am trying to build a workflow_set using tidymodels to compare models that are tuned using the Bayesian approach. I am not sure how to map the parameters that need to be tuned into the workflow_set. I am getting the following error:
"Error in dials::grid_latin_hypercube(param, size = n) : \n These arguments contains unknowns: `mtry`. See the `finalize()` function.\n"
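The message points at the upper bound of mtry, which depends on the number of predictors and so stays unknown until dials has seen some data. A minimal sketch of that behavior, assuming the dials package is installed and using mtcars purely as a stand-in dataset (it is not part of the question):

```r
library(dials)

# mtry() ships with an unknown upper bound; printing it shows the range as [1, ?]
mtry_param <- mtry()
mtry_param

# has_unknowns() reports whether a parameter still needs finalizing
has_unknowns(mtry_param)

# finalize() fills the upper bound from the number of columns it is given;
# mtcars[, -1] has 10 columns, so the upper bound becomes 10
mtry_final <- finalize(mtry_param, x = mtcars[, -1])
mtry_final
has_unknowns(mtry_final)
```

Any grid or initial design that includes mtry needs the finalized version, which is why tune_bayes stops when it has to generate the initial points itself.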
It looks like I need to supply initial values. When not using a workflow_set, I would do:
xgb_wflow <-
  tree_frogs_wflow %>%
  add_model(xgb_spec)

xgb_params <- extract_parameter_set_dials(xgb_spec) %>%
  finalize(tree_frogs_train)

xgb_res <-
  tune_bayes(
    object = xgb_wflow,
    resamples = folds,
    param_info = xgb_params,
    metrics = metric_set(pr_auc, roc_auc, accuracy),
    initial = 10,
    control = ctrl_bayes
  )
I am not sure how to do this using workflow_sets. My attempt is below; any help is appreciated.
library(tidymodels)
library(tidyverse)
library(stacks)

data("tree_frogs")

tree_frogs <- tree_frogs %>%
  select(-c(clutch, latency))

set.seed(1)
tree_frogs_split <- initial_split(tree_frogs)
tree_frogs_train <- training(tree_frogs_split)
tree_frogs_test <- testing(tree_frogs_split)

tree_frogs_rec <-
  recipe(reflex ~ ., data = tree_frogs_train) %>%
  step_dummy(all_nominal(), -reflex) %>%
  step_zv(all_predictors())

set.seed(1)
folds <- rsample::vfold_cv(tree_frogs_train, v = 5)

## Multinomial
mlt_spec <-
  multinom_reg(penalty = tune(), mixture = 1) %>%
  set_engine("glmnet") %>%
  set_mode("classification")

## Random Forest
rand_forest_spec <-
  rand_forest(
    mtry = tune(),
    min_n = tune(),
    trees = 500
  ) %>%
  set_mode("classification") %>%
  set_engine("ranger")

## XGBoost
xgb_spec <-
  boost_tree(
    trees = 1000,
    min_n = tune(),
    learn_rate = tune(),
    loss_reduction = tune(),
    sample_size = tune(),
    mtry = tune(),
    tree_depth = tune()
  ) %>%
  set_engine("xgboost") %>%
  set_mode("classification")

all_workflows <-
  workflow_set(
    preproc = list("basic_rec" = tree_frogs_rec),
    models = list(mlt = mlt_spec, rf = rand_forest_spec, xgb = xgb_spec)
  )

ctrl_bayes <- control_bayes(
  save_pred = TRUE,
  parallel_over = "everything",
  save_workflow = TRUE
)

bayes_results <-
  all_workflows %>%
  workflow_map(
    seed = 1,
    fn = "tune_bayes",
    resamples = folds,
    initial = 10,
    control = ctrl_bayes
  )

bayes_results %>%
  rank_results() %>%
  filter(.metric == "pr_auc") %>%
  select(model, .config, pr_auc = mean, rank)
Upvotes: 0
I found the answer in the documentation: the option_add function. I added the code below before the workflow_map call.
mlt_params <- extract_parameter_set_dials(mlt_spec) %>%
  finalize(tree_frogs_train)
xgb_params <- extract_parameter_set_dials(xgb_spec) %>%
  finalize(tree_frogs_train)
rf_params <- extract_parameter_set_dials(rand_forest_spec) %>%
  finalize(tree_frogs_train)

all_workflows <- all_workflows %>%
  option_add(param_info = mlt_params, id = "basic_rec_mlt")
all_workflows <- all_workflows %>%
  option_add(param_info = rf_params, id = "basic_rec_rf")
all_workflows <- all_workflows %>%
  option_add(param_info = xgb_params, id = "basic_rec_xgb")
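One detail that may be worth checking: finalize(tree_frogs_train) sizes mtry from every column of the raw training data, so it counts the outcome reflex and does not reflect the columns created by step_dummy. The following is a minimal sketch of finalizing against the baked predictors instead and then confirming that the option was stored. It assumes tidymodels is installed and uses mtcars with a regression random forest as a stand-in for the tree frogs setup; the names rec, rf_spec, wf_set, and baked_predictors are placeholders, not part of the original code.

```r
library(tidymodels)

# Stand-in recipe and model spec (mtcars replaces tree_frogs_train here)
rec <- recipe(mpg ~ ., data = mtcars)

rf_spec <-
  rand_forest(mtry = tune(), min_n = tune(), trees = 500) %>%
  set_mode("regression") %>%
  set_engine("ranger")

wf_set <-
  workflow_set(
    preproc = list(basic_rec = rec),
    models = list(rf = rf_spec)
  )

# Bake the recipe and keep only predictor columns, so the upper bound of
# mtry matches what the model actually receives
baked_predictors <- rec %>%
  prep() %>%
  bake(new_data = NULL, all_predictors())

rf_params <- extract_parameter_set_dials(rf_spec) %>%
  finalize(baked_predictors)

# Attach the finalized parameter set to the matching workflow id
wf_set <- wf_set %>%
  option_add(param_info = rf_params, id = "basic_rec_rf")

# Inspect what was stored; this is what workflow_map() passes on to tune_bayes()
wf_set$option[[1]]$param_info
```

With the real data, the same pattern would use tree_frogs_rec in place of rec and the ids basic_rec_mlt, basic_rec_rf, and basic_rec_xgb.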
Upvotes: 2