quant

Reputation: 4482

R h2o connection (memory) issue

I am trying to run a grid-search optimization for two algorithms (random forest and GBM) on different parts of a data set, using H2O. My code looks like

for (...)
{
    # read data

    # set up the h2o cluster
    h2o <- h2o.init(ip = "localhost", port = 54321, nthreads = detectCores() - 1)

    gbm.grid <- h2o.grid("gbm", grid_id = "gbm.grid",
                         x = names(td.train.h2o)[!names(td.train.h2o) %like% segment_binary],
                         y = segment_binary,
                         seed = 42, distribution = "bernoulli",
                         training_frame = td.train.h2o, validation_frame = td.train.hyper.h2o,
                         hyper_params = hyper_params, search_criteria = search_criteria)

    # shut down h2o
    h2o.shutdown(prompt = FALSE)

    # set up the h2o cluster again
    h2o <- h2o.init(ip = "localhost", port = 54321, nthreads = detectCores() - 1)

    rf.grid <- h2o.grid("randomForest", grid_id = "rf.grid",
                        x = names(td.train.h2o)[!names(td.train.h2o) %like% segment_binary],
                        y = segment_binary,
                        seed = 42, distribution = "bernoulli",
                        training_frame = td.train.h2o, validation_frame = td.train.hyper.h2o,
                        hyper_params = hyper_params, search_criteria = search_criteria)

    h2o.shutdown(prompt = FALSE)
}

The problem is that if I run the for loop in one go, I get the error

Error in .h2o.doSafeREST(h2oRestApiVersion = h2oRestApiVersion, urlSuffix = urlSuffix,  : 
  Unexpected CURL error: Failed to connect to localhost port 54321: Connection refused

P.S.: I am using the lines

 # shutdown h2o
h2o.shutdown(prompt = FALSE)

# setup h2o cluster
h2o <- h2o.init(ip = "localhost", port = 54321, nthreads = detectCores()-1)

so that I "reset" H2O and do not run out of memory.

I also read R H2O - Memory management but it is not clear to me how it works.

UPDATE

After following Mateusz's comment, I now call h2o.init() once outside the for loop and use h2o.removeAll() inside it. My code now looks like this:

h2o <- h2o.init(ip = "localhost", port = 54321, nthreads = detectCores() - 1)

for (...)
{
    # read data

    gbm.grid <- h2o.grid("gbm", grid_id = "gbm.grid",
                         x = names(td.train.h2o)[!names(td.train.h2o) %like% segment_binary],
                         y = segment_binary,
                         seed = 42, distribution = "bernoulli",
                         training_frame = td.train.h2o, validation_frame = td.train.hyper.h2o,
                         hyper_params = hyper_params, search_criteria = search_criteria)

    h2o.removeAll()

    rf.grid <- h2o.grid("randomForest", grid_id = "rf.grid",
                        x = names(td.train.h2o)[!names(td.train.h2o) %like% segment_binary],
                        y = segment_binary,
                        seed = 42, distribution = "bernoulli",
                        training_frame = td.train.h2o, validation_frame = td.train.hyper.h2o,
                        hyper_params = hyper_params, search_criteria = search_criteria)

    h2o.removeAll()
}

It seems to work, but now I get this error (?) during the grid optimization for the random forest:

[screenshot of the error message; image not reproduced]

Any ideas what this might be?

Upvotes: 1

Views: 1560

Answers (2)

Erin LeDell

Reputation: 8819

The cause of the error is that you are not changing the grid_id parameter in your loop. My recommendation is to let H2O auto-generate a grid id by leaving it unspecified/NULL. You can also create different grid ids (one for each dataset) manually, but it's not required.

You can only add new models to an existing grid (by re-using the same grid id) when you use the same training set. When you put a grid search in a for loop over different datasets and keep the same grid id, it will throw an error because you are trying to append models trained on different datasets to the same grid.
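A minimal sketch of that fix (the loop variable `i` and the bound `n_parts` are placeholders for your own iteration; the other arguments stay as in your question):

```r
for (i in 1:n_parts) {
    # read data for part i ...

    # Either give each grid a unique id per dataset:
    gbm.grid <- h2o.grid("gbm",
                         grid_id = paste0("gbm.grid.", i),  # unique per iteration
                         x = predictors, y = segment_binary,
                         seed = 42, distribution = "bernoulli",
                         training_frame = td.train.h2o,
                         validation_frame = td.train.hyper.h2o,
                         hyper_params = hyper_params,
                         search_criteria = search_criteria)

    # ...or simply omit grid_id entirely and let H2O auto-generate one.
}
```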

Upvotes: 1

Mateusz Dymczyk

Reputation: 15141

This seems quite wasteful, starting up h2o twice every iteration. If you just want to free up the memory you can use h2o.removeAll() instead.

As for the cause: h2o.shutdown() (any H2O shutdown) is not a synchronous operation, and some cleanup can still occur after the function returns (for example, handling of outstanding requests). You can check with h2o.clusterIsUp() whether the cluster is actually down before starting it again with h2o.init().
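If you do need to restart the cluster, one way to avoid the "Connection refused" race is to poll until the shutdown has actually completed. A sketch (the one-second polling interval is an arbitrary choice):

```r
h2o.shutdown(prompt = FALSE)

# Shutdown is asynchronous: wait until the cluster is really down
# before re-initializing it on the same port.
while (h2o.clusterIsUp()) {
    Sys.sleep(1)
}

h2o <- h2o.init(ip = "localhost", port = 54321, nthreads = detectCores() - 1)
```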

Upvotes: 3
