Charles Green

Reputation: 11

Question about making prediction code at the end of Section 4.1.6 in Applied Machine Learning Using mlr3 in R

While working through the online mlr3 book Applied Machine Learning Using mlr3 in R (https://mlr3book.mlr-org.com/chapters/chapter4/hyperparameter_optimization.html), I am having difficulty making sure that the hyperparameters are optimized only on the training data and that the subsequent prediction is made only on the test data. My code and the first error it produces are shown below. Note that after introducing this code the chapter moves on to auto_tuner, which handles this automatically, but for my purposes I need to do it manually.

Code

library(mlr3tuning)
library(mlr3tuningspaces)
library(mlr3learners)
library(mlr3extralearners)
library(e1071)
library(paradox)



#Specifying Task
tsk_sonar = tsk("sonar")
tsk_sonar$set_col_roles("Class", c("target", "stratum"))

#Partitioning Data set into Train and Test Samples
splits = mlr3::partition(tsk_sonar, ratio = 0.80)

#Defining Learner and range of hyperparameters for optimization
learner = lrn("classif.svm",
  cost  = to_tune(1e-5, 1e5, logscale = TRUE),
  gamma = to_tune(1e-5, 1e5, logscale = TRUE),
  kernel = "radial",
  type = "C-classification"
)

#Specifying the rows constituting the training data set for the learner
learner$train(tsk_sonar, row_ids = splits$train)

Error Message

> learner$train(tsk_sonar, row_ids = splits$train)
Error in svm.default(x = data, y = task$truth(), probability = (self$predict_type ==  : 
  'list' object cannot be coerced to type 'double'

Continued Code

#Specifying Tuning Instance
instance = ti(
  task = tsk_sonar,
  learner = learner,
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),
  terminator = trm("none")
)

# Defining Hyperparameter Search
tuner = tnr("grid_search", resolution = 5, batch_size = 10)

#Running hyperparameter tuning for optimization 
tuner$optimize(instance)

#Training the learner on the full data set
lrn_svm_tuned = lrn("classif.svm")
lrn_svm_tuned$param_set$values = instance$result_learner_param_vals

#Final trained model for use in prediction
lrn_svm_tuned$train(tsk_sonar)$model

#Create predictions on the test data
prediction = lrn_svm_tuned$predict(tsk_sonar, splits$test)

Upvotes: 1

Views: 20

Answers (1)

be-marc

Reputation: 1491

You found a bug: it shouldn't be possible to train a learner while a TuneToken is still present in its parameter set. The error has nothing to do with the train-test split. If you want to verify which rows were used during tuning, you can inspect the resampling splits in instance$archive$benchmark_result$resamplings after the optimization.
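
For reference, here is a minimal sketch of one way to keep the tuning confined to the training rows. It is not taken from the book or confirmed as the recommended approach. It assumes the same packages as the question, drops the early learner$train() call (the learner still holds TuneTokens at that point), and introduces tsk_train as an illustrative name for a copy of the task filtered to splits$train. The final model is then trained on the training rows only, rather than on the full task.

```r
library(mlr3)
library(mlr3tuning)
library(mlr3learners)  # provides classif.svm (requires e1071 to be installed)

set.seed(1)  # assumption: fixed seed only to make the split reproducible

tsk_sonar = tsk("sonar")
tsk_sonar$set_col_roles("Class", c("target", "stratum"))
splits = partition(tsk_sonar, ratio = 0.80)

# Illustrative name: a copy of the task restricted to the training rows,
# so the resampling inside the tuning instance never touches the test rows.
tsk_train = tsk_sonar$clone()$filter(splits$train)

learner = lrn("classif.svm",
  cost   = to_tune(1e-5, 1e5, logscale = TRUE),
  gamma  = to_tune(1e-5, 1e5, logscale = TRUE),
  kernel = "radial",
  type   = "C-classification"
)

# Tune with 3-fold CV on the training rows only.
instance = ti(
  task = tsk_train,
  learner = learner,
  resampling = rsmp("cv", folds = 3),
  measures = msr("classif.ce"),
  terminator = trm("none")
)
tuner = tnr("grid_search", resolution = 5, batch_size = 10)
tuner$optimize(instance)

# Fresh learner with the tuned values, fitted on the training rows only.
lrn_svm_tuned = lrn("classif.svm")
lrn_svm_tuned$param_set$values = instance$result_learner_param_vals
lrn_svm_tuned$train(tsk_sonar, row_ids = splits$train)

# Predict and score on the held-out test rows.
prediction = lrn_svm_tuned$predict(tsk_sonar, row_ids = splits$test)
prediction$score(msr("classif.ce"))
```

Because filter() keeps the original row ids, splits$test can still be used against tsk_sonar for the final prediction.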

Upvotes: 1
