Reputation: 525
Is there any way to extract the time/memory required for each iteration of a grid search? I am looking to plot an outcome metric (e.g. AUC) vs. processing requirements to examine the cost-benefit of adding complexity to my model.
Upvotes: 1
Views: 199
Reputation: 8819
I assume by "each iteration of the grid search", you mean each model in the grid search. So you're asking how to find model training times in a grid search. If so, here's how to do that.
H2O stores each model's start time in the model object (as milliseconds since the Unix epoch). The difference between consecutive start times gives the training time of a model (plus a small amount of scheduling overhead), so you can deduce the training time for every model in the grid except the last one to start.
In R, the model start time is stored at my_model@model$start_time.
Here is an example using the iris dataset and a GBM grid:
library(h2o)
h2o.init()
# Load iris dataset
data("iris")
train <- as.h2o(iris)
# GBM hyperparameters
gbm_params <- list(learn_rate = seq(0.01, 0.1, 0.01),
                   max_depth = seq(2, 10, 1),
                   sample_rate = seq(0.5, 1.0, 0.1),
                   col_sample_rate = seq(0.1, 1.0, 0.1))
search_criteria <- list(strategy = "RandomDiscrete", max_models = 5)
# Train and cross-validate a grid of GBMs
gbm_grid <- h2o.grid("gbm", x = 1:4, y = 5,
                     training_frame = train,
                     nfolds = 3,
                     ntrees = 100,
                     seed = 1,
                     hyper_params = gbm_params,
                     search_criteria = search_criteria)
# Model Start times (milliseconds since unix epoch)
start_times <- sort(sapply(gbm_grid@model_ids, function(m) h2o.getModel(m)@model$start_time))
# Model training times (milliseconds): difference between consecutive start times
train_time_ms <- diff(start_times)
print(train_time_ms)
# 758 662 532 469
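To get at the cost-benefit plot in the question, these timings can be paired with a cross-validated metric per model. Here is a rough sketch building on the grid above. Note that iris is a three-class problem, so cross-validated logloss stands in for AUC here (swap in h2o.auc for a binomial target); the last model to start is dropped because its training time cannot be deduced this way:
# Model IDs ordered by start time (unlist gives a character vector)
ids <- unlist(gbm_grid@model_ids)
starts <- sapply(ids, function(m) h2o.getModel(m)@model$start_time)
ids <- ids[order(starts)]
# Training time (ms) for all models except the last one to start
train_time_ms <- diff(sort(starts))
# Cross-validated logloss for each timed model
logloss <- sapply(ids[-length(ids)],
                  function(m) h2o.logloss(h2o.getModel(m), xval = TRUE))
# Cost-benefit view: metric vs. processing time
plot(train_time_ms, logloss,
     xlab = "Training time (ms)", ylab = "Cross-validated logloss")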
Upvotes: 1