Reputation: 378
I am running the R package h2o version 3.20.0.2 on an Azure cluster.
After fitting many h2o models, the h2o cluster seems to have become unresponsive with this error message:
Warning in .h2o.__checkConnectionHealth() : H2O cluster node 127.0.0.1:54321 is behaving slowly and should be inspected manually.
I have tried to reset the cluster with h2o.shutdown(), but the problem persists and h2o.init() fails.
Without admin rights, how can I truly restart the h2o server, and how can I avoid this problem in the future?
Upvotes: 2
Views: 1681
Reputation: 3671
The most common reason for this is that you have used all the memory in the cluster.
Options include doing things like:
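One option along these lines, sketched here under assumptions rather than as a definitive fix, is to remove objects you no longer need between model fits so they stop occupying cluster memory. The helper name free_h2o_memory and its keep argument are hypothetical; h2o.ls() and h2o.rm() are functions from the h2o R package, and the sketch assumes h2o.ls() returns a data frame with a key column.

```r
# Hypothetical helper: remove every object in the H2O cluster except the
# keys listed in `keep`. Assumes a running cluster reachable from this session.
free_h2o_memory <- function(keep = character(0)) {
  # h2o.ls() lists the keys currently stored in the cluster
  keys <- as.character(h2o::h2o.ls()$key)
  # Everything not explicitly kept is a candidate for removal
  drop <- setdiff(keys, keep)
  if (length(drop) > 0) {
    h2o::h2o.rm(drop)
  }
  # Return the removed keys invisibly so the caller can log them
  invisible(drop)
}

# Example use between batches of fits (the key name is a placeholder):
# free_h2o_memory(keep = "training_frame_key")
```

If you start the server yourself from R, h2o.init() also accepts a max_mem_size argument (for example "8g", a placeholder value) to give the Java process a larger heap.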
h2o.shutdown() uses an API call to the backend to do a cooperative shutdown, but if the backend is already in a bad state, it may not work.
If you are running R on the same host as the H2O server, you can do things like system("ps -ef") in R to run Linux shell commands and try to fix it up that way, even without a direct terminal prompt. Find the h2o Java process and kill it.
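A minimal sketch of that approach follows. It assumes a Linux host, that the server was started as a java process whose command line mentions h2o (for example via h2o.jar), and that the process belongs to your own user so no admin rights are needed. The helper name find_h2o_pids is hypothetical.

```r
# Hypothetical helper: extract the PIDs of H2O java processes
# from the lines printed by `ps -ef`.
find_h2o_pids <- function(ps_lines) {
  # Keep lines that mention both java and h2o
  hits <- ps_lines[grepl("java", ps_lines, fixed = TRUE) &
                   grepl("h2o", ps_lines, ignore.case = TRUE)]
  # In `ps -ef` output the PID is the second whitespace-separated column
  fields <- strsplit(trimws(hits), "\\s+")
  as.integer(vapply(fields, function(f) f[2], character(1)))
}

# Capture the process list as a character vector, one line per process
ps_lines <- system("ps -ef", intern = TRUE)
pids <- find_h2o_pids(ps_lines)
print(pids)

# After checking that these PIDs really are your H2O server:
# tools::pskill(pids)                   # SIGTERM first
# tools::pskill(pids, tools::SIGKILL)   # force kill if it is still running
```

Once the old process is gone, calling h2o.init() again should be able to start a fresh server on the same port.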
Upvotes: 1