Reputation: 1196
We have an application that spawns new JVMs and executes code on behalf of our users. Sometimes those JVMs run out of memory, and in that case they behave in very different ways. Sometimes they throw an OutOfMemoryError, sometimes they freeze. I can detect the latter with a very lightweight background thread that stops sending heartbeat signals when memory runs low. In that case we kill the JVM, but we can never be absolutely sure what the real reason for the missing heartbeat was. (It could just as well have been a network issue or a segmentation fault.)
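For reference, the heartbeat thread has roughly the following shape. This is a minimal sketch rather than our actual code; the interval and the way a beat is reported (here a line on stdout that the controller reads) are illustrative assumptions:

    // Minimal sketch of the heartbeat thread described above. It allocates almost
    // nothing, so when it stops beating the worker JVM is usually stuck in GC or
    // otherwise unhealthy.
    public final class Heartbeat {
        public static Thread start(long intervalMillis) {
            Thread t = new Thread(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    // Hypothetical protocol: the controller reads these lines from the pipe.
                    System.out.println("HEARTBEAT " + System.currentTimeMillis());
                    try {
                        Thread.sleep(intervalMillis);
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }, "heartbeat");
            t.setDaemon(true);   // must not keep the worker JVM alive on its own
            t.start();
            return t;
        }
    }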
What is the best way to reliably detect out of memory conditions in a JVM?
In theory, the -XX:OnOutOfMemoryError option looks promising, but it is effectively unusable due to this bug: https://bugs.openjdk.java.net/browse/JDK-8027434
Catching an OutOfMemoryError is actually not a good alternative for well-known reasons (e.g. you never know where it happens), though it does work in many cases.
The cases that remain are those where the JVM freezes and does not throw an OutOfMemoryError; I'm still sure that memory is the cause in those cases.
Are there any alternatives or workarounds? Garbage collection settings to make the JVM terminate itself rather than freezing?
EDIT: I'm in full control of both the forking and the forked JVM as well as the code being executed within those, both are running on Linux, and it's ok to use OS specific utilities if that helps.
Upvotes: 11
Views: 2257
Reputation: 10423
The only real option is (unfortunately) to terminate the JVM as soon as possible, since you probably can't change all of your code to catch the error and respond.
If you don't trust -XX:OnOutOfMemoryError (I wonder why it should not use vfork, which is used by Java 8, and it does work on Windows), you can at least trigger a heap dump and monitor externally for those dump files:
java .... -XX:+HeapDumpOnOutOfMemoryError "-XX:OnOutOfMemoryError=kill %p"
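For illustration, launching the worker with those flags from the controller and then checking for dump files could look roughly like this. It is a sketch: WorkerMain, the heap size, and the /tmp/oom-dumps path are placeholders, and -XX:HeapDumpPath is an extra flag added here only so the controller knows where to look.

    // Sketch: the controller forks a worker JVM with the OOM flags from above and
    // later checks whether a heap dump appeared, which signals an out of memory crash.
    import java.io.File;
    import java.io.IOException;

    public class Controller {
        public static Process forkWorker() throws IOException {
            ProcessBuilder pb = new ProcessBuilder(
                    "java",
                    "-Xmx512m",                                  // placeholder heap size
                    "-XX:+HeapDumpOnOutOfMemoryError",
                    "-XX:HeapDumpPath=/tmp/oom-dumps",           // assumed dump directory
                    "-XX:OnOutOfMemoryError=kill %p",
                    "WorkerMain");                               // placeholder main class
            pb.inheritIO();
            return pb.start();
        }

        // After the worker exits (or its heartbeat stops), check whether a dump was written.
        public static boolean workerRanOutOfMemory() {
            File[] dumps = new File("/tmp/oom-dumps")
                    .listFiles((dir, name) -> name.endsWith(".hprof"));
            return dumps != null && dumps.length > 0;
        }
    }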
Upvotes: 2
Reputation: 444
If you have control over both the application and its configuration, the best solution would be to find the underlying cause of the OutOfMemoryError and fix it, instead of trying to hide the symptoms either by catching the error or just restarting JVMs.
From what you describe, it definitely looks as if the application running on the JVM is either leaking memory, running with under-provisioned resources (memory in your case), or occasionally processing transactions that require abnormally large chunks of heap. The solution would be different in each of those cases.
Upvotes: 0
Reputation: 1196
After experimenting with this for quite some time, this is the solution that worked for us:
1. Catch OutOfMemoryError wherever possible and exit immediately, signalling the out of memory condition with an exit code to the controller JVM.
2. Regularly check the amount of used memory via Runtime. When the amount of memory used is close to critical, create a flag file that signals the out of memory condition to the controller JVM. If we recover from this condition and exit normally, delete that file before we exit.
3. Check whether the file hs_err_pidXXX.log exists and contains the line "Out of Memory Error". (This file is generated by java in case it crashes.)
Only after implementing all of those checks were we able to handle all cases where the forked JVM ran out of memory. We believe that since then, we have not missed a case where this happened.
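To make the worker-side part of those checks concrete, here is a rough sketch under our assumptions; the exit code, the flag-file path, and the 90% threshold are arbitrary illustrative choices, not fixed values:

    // Sketch of the worker side: catch OutOfMemoryError at the entry point and halt
    // with a dedicated exit code, and watch used memory so a flag file is created
    // when usage gets close to the limit. All names and numbers here are illustrative.
    import java.io.File;
    import java.io.IOException;

    public class Worker {
        static final int EXIT_CODE_OOM = 42;                 // agreed upon with the controller
        static final File OOM_FLAG = new File("oom.flag");   // watched by the controller

        public static void main(String[] args) {
            Thread watchdog = new Thread(Worker::watchMemory, "memory-watchdog");
            watchdog.setDaemon(true);
            watchdog.start();
            try {
                runUserCode(args);                            // whatever the forked JVM executes
                OOM_FLAG.delete();                            // normal exit: clear the flag
            } catch (OutOfMemoryError e) {
                // Exit immediately; halt() skips shutdown hooks that might allocate.
                Runtime.getRuntime().halt(EXIT_CODE_OOM);
            }
        }

        static void watchMemory() {
            Runtime rt = Runtime.getRuntime();
            while (true) {
                long used = rt.totalMemory() - rt.freeMemory();
                if (used > 0.9 * rt.maxMemory()) {
                    try { OOM_FLAG.createNewFile(); } catch (IOException ignored) {}
                }
                try { Thread.sleep(1000); } catch (InterruptedException e) { return; }
            }
        }

        static void runUserCode(String[] args) { /* user code runs here */ }
    }

The controller then treats any of the three signals (the dedicated exit code, the flag file, or an hs_err_pidXXX.log containing "Out of Memory Error") as an out of memory condition.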
The java flag -XX:OnOutOfMemoryError was not used because of the fork problem, and -XX:+HeapDumpOnOutOfMemoryError was not used because a heap dump is more than we need.
The solution is certainly not the most elegant piece of code ever written, but it did the job for us.
Upvotes: 1