Reputation: 1196
We have an application that spawns new JVMs and executes code on behalf of our users. Sometimes those JVMs run out of memory, and in that case they behave in very different ways. Sometimes they throw an OutOfMemoryError, sometimes they freeze. I can detect the latter with a very lightweight background thread that stops sending heartbeat signals when memory runs low. In that case we kill the JVM, but we can never be absolutely sure what the real reason for the missing heartbeat was. (It could just as well have been a network issue or a segmentation fault.)
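For reference, the heartbeat thread has roughly the following shape. This is a minimal sketch rather than our actual code; the interval and the way a beat is reported (here a line on stdout that the controller reads) are illustrative assumptions:

    // Minimal sketch of the heartbeat thread described above. It allocates almost
    // nothing, so when it stops beating the worker JVM is usually stuck in GC or
    // otherwise unhealthy.
    public final class Heartbeat {
        public static Thread start(long intervalMillis) {
            Thread t = new Thread(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    // Hypothetical protocol: the controller reads these lines from the pipe.
                    System.out.println("HEARTBEAT " + System.currentTimeMillis());
                    try {
                        Thread.sleep(intervalMillis);
                    } catch (InterruptedException e) {
                        return;
                    }
                }
            }, "heartbeat");
            t.setDaemon(true);   // must not keep the worker JVM alive on its own
            t.start();
            return t;
        }
    }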
What is the best way to reliably detect out of memory conditions in a JVM?
In theory, the -XX:OnOutOfMemoryError option looks promising, but it is effectively unusable due to this bug: https://bugs.openjdk.java.net/browse/JDK-8027434
Catching an OutOfMemoryError is actually not a good alternative for well-known reasons (e.g. you never know where it happens), though it does work in many cases.
The cases that remain are those where the JVM freezes and does not throw an OutOfMemoryError; I'm still sure that memory is the cause in those cases.
Are there any alternatives or workarounds? Garbage collection settings to make the JVM terminate itself rather than freezing?
EDIT: I'm in full control of both the forking and the forked JVM as well as the code being executed within those, both are running on Linux, and it's ok to use OS specific utilities if that helps.
Upvotes: 11
Views: 2257
Reputation: 10423
The only real option is (unfortunately) to terminate the JVM as soon as possible, since you probably can't change all of your code to catch the error and respond.
If you don't trust -XX:OnOutOfMemoryError (I wonder why it should not use vfork, which is used by Java 8, and it does work on Windows), you can at least trigger a heap dump and monitor externally for those dump files:
java .... -XX:+HeapDumpOnOutOfMemoryError "-XX:OnOutOfMemoryError=kill %p"
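For illustration, launching the worker with those flags from the controller and then checking for dump files could look roughly like this. It is a sketch: WorkerMain, the heap size, and the /tmp/oom-dumps path are placeholders, and -XX:HeapDumpPath is an extra flag added here only so the controller knows where to look.

    // Sketch: the controller forks a worker JVM with the OOM flags from above and
    // later checks whether a heap dump appeared, which signals an out of memory crash.
    import java.io.File;
    import java.io.IOException;

    public class Controller {
        public static Process forkWorker() throws IOException {
            ProcessBuilder pb = new ProcessBuilder(
                    "java",
                    "-Xmx512m",                                  // placeholder heap size
                    "-XX:+HeapDumpOnOutOfMemoryError",
                    "-XX:HeapDumpPath=/tmp/oom-dumps",           // assumed dump directory
                    "-XX:OnOutOfMemoryError=kill %p",
                    "WorkerMain");                               // placeholder main class
            pb.inheritIO();
            return pb.start();
        }

        // After the worker exits (or its heartbeat stops), check whether a dump was written.
        public static boolean workerRanOutOfMemory() {
            File[] dumps = new File("/tmp/oom-dumps")
                    .listFiles((dir, name) -> name.endsWith(".hprof"));
            return dumps != null && dumps.length > 0;
        }
    }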
Upvotes: 2
Reputation: 444
If you have control over both the application and its configuration, the best solution would be to find the underlying cause of the OutOfMemoryError and fix it, instead of trying to hide the symptoms either by catching the error or just restarting JVMs.
From what you describe, it definitely looks as if the application running on the JVM is either leaking memory, running with under-provisioned resources (memory in your case), or occasionally processing transactions that require abnormally large chunks of heap. The solution would be different in each of those cases.
Upvotes: 0
Reputation: 1196
After experimenting with this for quite some time, this is the solution that worked for us:
1. Catch OutOfMemoryError wherever possible and exit immediately, signalling the out of memory condition with an exit code to the controller JVM.
2. Regularly check the amount of used memory via Runtime. When the amount of memory used is close to critical, create a flag file that signals the out of memory condition to the controller JVM. If we recover from this condition and exit normally, delete that file before we exit.
3. Check whether the file hs_err_pidXXX.log exists and contains the line "Out of Memory Error". (This file is generated by java in case it crashes.)
Only after implementing all of those checks were we able to handle all cases where the forked JVM ran out of memory. We believe that since then, we have not missed a case where this happened.
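To make the worker-side part of those checks concrete, here is a rough sketch under our assumptions; the exit code, the flag-file path, and the 90% threshold are arbitrary illustrative choices, not fixed values:

    // Sketch of the worker side: catch OutOfMemoryError at the entry point and halt
    // with a dedicated exit code, and watch used memory so a flag file is created
    // when usage gets close to the limit. All names and numbers here are illustrative.
    import java.io.File;
    import java.io.IOException;

    public class Worker {
        static final int EXIT_CODE_OOM = 42;                 // agreed upon with the controller
        static final File OOM_FLAG = new File("oom.flag");   // watched by the controller

        public static void main(String[] args) {
            Thread watchdog = new Thread(Worker::watchMemory, "memory-watchdog");
            watchdog.setDaemon(true);
            watchdog.start();
            try {
                runUserCode(args);                            // whatever the forked JVM executes
                OOM_FLAG.delete();                            // normal exit: clear the flag
            } catch (OutOfMemoryError e) {
                // Exit immediately; halt() skips shutdown hooks that might allocate.
                Runtime.getRuntime().halt(EXIT_CODE_OOM);
            }
        }

        static void watchMemory() {
            Runtime rt = Runtime.getRuntime();
            while (true) {
                long used = rt.totalMemory() - rt.freeMemory();
                if (used > 0.9 * rt.maxMemory()) {
                    try { OOM_FLAG.createNewFile(); } catch (IOException ignored) {}
                }
                try { Thread.sleep(1000); } catch (InterruptedException e) { return; }
            }
        }

        static void runUserCode(String[] args) { /* user code runs here */ }
    }

The controller then treats any of the three signals (the dedicated exit code, the flag file, or an hs_err_pidXXX.log containing "Out of Memory Error") as an out of memory condition.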
The java flag -XX:OnOutOfMemoryError was not used because of the fork problem, and -XX:+HeapDumpOnOutOfMemoryError was not used because a heap dump is more than we need.
The solution is certainly not the most elegant piece of code ever written, but it did the job for us.
Upvotes: 1