Marine

Reputation: 87

Creating a learner object for Bayesian optimization using the mlr and mlrMBO packages: example with a neural network model using the nnet package in R

I aim to build species distribution models (SDMs) using several machine learning models to evaluate the relationships between species presence/absence (encoded as a binary 1/0 variable, where 1 represents presence) and environmental variables.

Currently, I am running Bayesian optimization with the mlrMBO package in R to tune the hyperparameters of these models. However, the model predictions are inaccurate: for example, the predicted values of the neural network model range from 0.8 to 1, whereas I expected them to span the full 0 to 1 range. I am not well-versed in Bayesian optimization, and I suspect the issue stems from this step.

I believe there is an inconsistency between my optimization function (see below) and the makeLearner call. Specifically, I convert the binary response variable "presence" (0/1) into a factor inside the optimization function, yet in makeLearner I specify regr.nnet. When I instead tried classif.nnet for the surrogate model, I got the following error message:

Error in checkStuff(fun, design, learner, control) :
  mbo requires regression learner!
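
From this error, I gather that the learner passed to mbo() (the surrogate) must be a regression learner even when the tuned model is a classifier, since the surrogate models the objective value (here, the TSS) rather than the species data. For example, a kriging surrogate, which I understand is a common choice (this sketch assumes the DiceKriging package is installed):

## Kriging surrogate: a regression learner, as mbo() requires; "se"
## predictions would also allow infill criteria such as expected improvement
surrogate_km <- mlr::makeLearner("regr.km", predict.type = "se")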

I also think I need to integrate the following line of code, but I don't know where or how to fit it into my code:

test <- makeClassifTask(data = train_data, target="presence")
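My guess is that this task would live inside the optimization function and replace the direct nnet::nnet() call, roughly as follows (an untested sketch; x is the hyperparameter vector received by the optimization function):

## Sketch (untested): training classif.nnet through mlr inside the
## objective function instead of calling nnet::nnet() directly
task_train <- mlr::makeClassifTask(data = train_data, target = "presence")
learner <- mlr::makeLearner("classif.nnet", predict.type = "prob",
                            size = x[["size"]], rang = x[["rang"]],
                            decay = x[["decay"]], trace = FALSE)
mod <- mlr::train(learner, task_train)
pred <- predict(mod, newdata = test_data)  # class probabilities in [0, 1]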

Any help with the Bayesian optimization setup would be greatly appreciated, as I believe I am doing something incorrectly.

Below is the relevant code (only the essential commands are included for clarity, because the optimization function is long). I added a comment above each step.

## Create an optimization function
create_optimization_function <- function(x){
  
  ## Create a list containing the performance tables for each model replication
  performance_tables_by_replication <- list()
  
  ## Loop over the model replications
  for(replication_ID in 1:nb_replications){
    
    ## Split the data into training and testing sets
    ...
    train_data <- train_test_split$train
    test_data <- train_test_split$test
    
    ## Create a list containing the performance tables for each data partition
    performance_tables_by_partition <- list()
    
    ## Loop over the data partitions
    for(partition_ID in 1:nb_partitions){
      
      ## Run the model using the training data
      set.seed(1)
      train_data <- train_data %>%
        dplyr::mutate_at(response, as.factor)
      mod <- nnet::nnet(formula = model_formula,
                        data = train_data,
                        size = x["size"],
                        rang = x["rang"],
                        decay = x["decay"],
                        MaxNWts = 1*(10*(length(c(nb_predictors)) + 1) + 10 + 1),
                        trace = FALSE)
      
      ## Compute spatial predictions using the testing data
      ...
      
      ## Create a performance table
      ...
      
      ## Add the performance table into the list "performance_tables_by_partition"
      ...
      
    }
    
    ## Combine all performance tables from the list "performance_tables_by_partition"
    ...
    
    ## Add the performance tables into the list "performance_tables_by_replication"
    ...
    
  }
  
  ## Combine all performance tables from the list "performance_tables_by_replication"
  
  ## Compute the mean of performance metric values
  
  ## Retrieve the maximum value for the hyperparameter selection metric (TSS)
  return(max_TSS)
  
}
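
For reference, the hyperparameter selection metric TSS (true skill statistic) is sensitivity + specificity - 1. Here is a hypothetical helper (compute_TSS is my own naming, not a packaged function) illustrating how it can be computed from observed and predicted classes:

## Hypothetical helper: TSS = sensitivity + specificity - 1
compute_TSS <- function(observed, predicted_class) {
  cm <- table(factor(observed, levels = c(0, 1)),
              factor(predicted_class, levels = c(0, 1)))
  sensitivity <- cm["1", "1"] / sum(cm["1", ])  # TP / (TP + FN)
  specificity <- cm["0", "0"] / sum(cm["0", ])  # TN / (TN + FP)
  sensitivity + specificity - 1
}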

## Build hyperparameter combinations
hyperparameter_combinations <- ParamHelpers::makeParamSet(
  ParamHelpers::makeIntegerParam("size", lower = 1, upper = 10),
  ParamHelpers::makeNumericParam("rang", lower = 0.1, upper = 1),
  ParamHelpers::makeNumericParam("decay", lower = 0, upper = 2))

## Create an objective function
objective_function <- smoof::makeSingleObjectiveFunction(
  name = "model_tuning",
  fn = create_optimization_function,
  par.set = hyperparameter_combinations,
  minimize = FALSE)
  
## Generate a random Latin hypercube design (randomLHS comes from the lhs package)
random_LH_design <- ParamHelpers::generateDesign(
  n = 100,
  par.set = ParamHelpers::getParamSet(objective_function),
  fun = lhs::randomLHS)
  
## Define the control parameters for Bayesian optimization
MBO_control_parameters <- mlrMBO::makeMBOControl()
MBO_control_parameters <- mlrMBO::setMBOControlTermination(MBO_control_parameters,
                                                           iters = 10)
MBO_control_parameters <- mlrMBO::setMBOControlInfill(MBO_control_parameters,
                                                      crit = mlrMBO::makeMBOInfillCritMeanResponse())
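## NB: if the surrogate predicted standard errors (predict.type = "se"),
## my understanding is that the infill criterion could instead be
## expected improvement:
## MBO_control_parameters <- mlrMBO::setMBOControlInfill(MBO_control_parameters,
##                                                       crit = mlrMBO::makeMBOInfillCritEI())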
  
## Define the surrogate model
surrogate_model <- mlr::makeLearner("regr.nnet", predict.type = "response")
  
## Start the optimization process
MBO_results <- mlrMBO::mbo(fun = objective_function,
                           design = random_LH_design,
                           learner = surrogate_model,
                           control = MBO_control_parameters)
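
## Inspect the result: as far as I understand the mbo() return value, it
## exposes the best point and the best objective value
MBO_results$x  # best hyperparameter combination (size, rang, decay)
MBO_results$y  # best (maximized) mean TSS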

UPDATE: Following Lars' comment, I have rewritten my code using the mlr3 ecosystem. As a reminder, my objective is to tune the hyperparameters of several machine learning models in order to find the best combination of hyperparameters and to map spatial predictions. In the code below, I use a non-spatial resampling method, because I had already filtered the raw data to reduce spatial autocorrelation, and I use nested resampling to reduce bias in the estimation of model performance. However, I encounter one error in the auto_tuner step when I use the following line:

search_space_nnet <- paradox::generate_design_lhs(search_space_nnet, n = 100)

Error in UseMethod("as_search_space") : 
  no applicable method for 'as_search_space' applied to an object of class c('Design', 'R6')

Here is a reproducible example:

tsk_sonar = tsk("sonar")

search_space <- paradox::ps(cost  = paradox::p_dbl(lower = 1e-1, upper = 1e5), gamma = paradox::p_dbl(lower = 1e-1, upper = 1))
search_space <- paradox::generate_design_lhs(search_space, n = 100)

at = mlr3tuning::auto_tuner(tuner = mlr3tuning::tnr("mbo"),
                            learner = mlr3::lrn("classif.svm", predict_type = "prob", kernel = "radial", type = "C-classification"),
                            resampling = mlr3::rsmp("cv", folds = 4),
                            measure = mlr3::msr("classif.auc"),
                            search_space = search_space,
                            terminator = mlr3tuning::trm("evals", n_evals = 20))

rr = mlr3::resample(tsk_sonar, at, mlr3::rsmp("cv", folds = 3), store_models = TRUE)
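
Checking the classes seems to confirm a type mismatch between what I pass and what auto_tuner expects:

## auto_tuner's search_space expects a ParamSet; generate_design_lhs()
## returns a Design object, which matches the class in the error message
ps_only <- paradox::ps(cost = paradox::p_dbl(lower = 1e-1, upper = 1e5))
class(ps_only)                                       # "ParamSet" "R6"
class(paradox::generate_design_lhs(ps_only, n = 5))  # "Design" "R6"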

Here is my code:

## Build a spatio-temporal classification task
sp_task = mlr3spatial::as_task_classif_st(
  x = dat, target = "presence", positive = "1",
  coordinate_names = c("decimalLongitude", "decimalLatitude"),
  crs = "+proj=longlat +datum=WGS84 +no_defs +type=crs")
## summary(sp_task)
## NB: dat is the data frame containing raw data

## Split the task into training and testing sets
partitioned_data = partition(sp_task)
## print(partitioned_data)

## Define the arguments for the tuning process
## Define search spaces for each algorithm type
search_space_nnet <- paradox::ps(size = p_int(lower = 1, upper = 10),
                                 rang = p_dbl(lower = 0.1, upper = 1),
                                 decay = p_dbl(lower = 0, upper = 2))
search_space_nnet <- paradox::generate_design_lhs(search_space_nnet, n = 100)
## search_space_nnet$print()

## Create learners
learner_nnet = lrn("classif.nnet", predict_type = "prob")
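## NB: classif.nnet is provided by the mlr3learners package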

## Define a measure
measure <- msr("classif.auc")

## Create a terminator
terminator <- trm("evals", n_evals = 20) ## trm("none"), trm("run_time", secs = 900) 

## Define a tuner
tuner <- tnr("mbo")
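## NB: tnr("mbo") requires the mlr3mbo package to be loaded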

## Define a resampling used for the inner loop
inner_resampling <- rsmp("cv", folds = 2)

## Define a resampling used for the outer loop
outer_resampling <- rsmp("cv", folds = 2)

## Run the tuning process
set.seed(1234)
at = mlr3tuning::auto_tuner(tuner = tuner, 
                            learner = learner_nnet,
                            resampling = inner_resampling,
                            measure = measure,
                            terminator = terminator,
                            search_space = search_space_nnet)
rr = resample(sp_task, at, outer_resampling, store_models = TRUE)
## extract_inner_tuning_archives(rr)
## extract_inner_tuning_results(rr)
## rr$aggregate()
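
Once this runs, my plan for the mapping step is roughly the following (a sketch; new_env_data is a hypothetical data frame holding the environmental predictors of the prediction grid):

## Train the AutoTuner on the full task, then predict presence
## probabilities onto new environmental data for mapping
at$train(sp_task)
spatial_predictions <- at$predict_newdata(new_env_data)
head(spatial_predictions$prob)  # per-class probabilities, incl. presence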

Upvotes: 4

Views: 265

Answers (0)
