swqnzd

Reputation: 61

Why does the JVM slow down over time only when using the CMS garbage collector?

I have an application that uses Nashorn. For the purposes of my example, I create a ScriptContext, prepare it by executing some JavaScript that defines a few globals, and then reuse that context over and over in a single thread by calling NashornScriptEngine#eval(String, ScriptContext) in a tight loop. I do not store the result anywhere, and as far as I can tell my application code does not cause any side effects.
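The setup described above can be sketched roughly as follows. This is a minimal sketch, not the actual application code: the class name, the script text, and the round and iteration counts are placeholders, and it goes through the standard javax.script API instead of referencing NashornScriptEngine directly. It assumes a JDK that still bundles Nashorn (8 through 14) and is bounded so that it terminates.

```java
import javax.script.ScriptContext;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;
import javax.script.SimpleScriptContext;

public class NashornLoop {
    // Creates one context and runs a (placeholder) setup script that defines a global.
    static ScriptContext prepareContext(ScriptEngine engine) throws ScriptException {
        ScriptContext ctx = new SimpleScriptContext();
        ctx.setBindings(engine.createBindings(), ScriptContext.ENGINE_SCOPE);
        engine.eval("var base = 40;", ctx);
        return ctx;
    }

    // Evaluates the same script against the same context in a tight loop,
    // discarding every result, and returns the elapsed nanoseconds per round.
    static long[] runRounds(ScriptEngine engine, ScriptContext ctx, int rounds, int iterations)
            throws ScriptException {
        long[] elapsed = new long[rounds];
        for (int r = 0; r < rounds; r++) {
            long start = System.nanoTime();
            for (int i = 0; i < iterations; i++) {
                engine.eval("base + 2", ctx); // result intentionally not stored
            }
            elapsed[r] = System.nanoTime() - start;
        }
        return elapsed;
    }

    public static void main(String[] args) throws ScriptException {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName("nashorn");
        if (engine == null) {
            System.out.println("Nashorn is not available on this JDK");
            return;
        }
        ScriptContext ctx = prepareContext(engine);
        long[] elapsed = runRounds(engine, ctx, 3, 10_000);
        for (int r = 0; r < elapsed.length; r++) {
            System.out.println("round " + r + ": " + (elapsed[r] / 1_000_000) + " ms");
        }
    }
}
```

Comparing the per-round times with and without -XX:+UseConcMarkSweepGC over a long run is how the slowdown shows up.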

With the default GC this works fine indefinitely. But when I run the same application with -XX:+UseConcMarkSweepGC, performance degrades significantly over time. When the program starts, it takes about 2 minutes to run 1,000,000 iterations. But after 2 hours, the same 1,000,000 iterations take about 4 minutes. It continues to get worse from there.

I've also tested regularly throwing away the NashornScriptEngine instance along with the ScriptContext, starting over entirely. At that point, my application has no reference to any variable from the previous execution. That does not improve these performance issues.

Any idea what's going on? I need to run with -XX:+UseConcMarkSweepGC because this is just a small piece of a larger, long-lived application.

I have some screenshots from Java Mission Control below (taken from Flight Recorder).

Thanks!

[Screenshots: Heap, Reference Objects, Top Growers]

Here I've picked two GCs, one from the beginning of the recording, one from the end. Note how "JNI Weak Reference" time increases significantly, as does "GC Pause".

[Screenshots: GC At Beginning, GC At End]

Upvotes: 3

Views: 1293

Answers (2)

egorlitvinenko

Reputation: 2776

There is a bug for this: https://bugs.openjdk.java.net/browse/JDK-8177098. Could you try the workaround of recreating the NashornScriptEngineFactory each time?
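A minimal sketch of what that could look like, under a few assumptions: it uses a fresh ScriptEngineManager per engine, which re-runs service discovery and therefore instantiates a new factory, instead of calling the NashornScriptEngineFactory constructor directly (on JDK 8 to 14 you could also do `new NashornScriptEngineFactory().getScriptEngine()`). The class name, the `recreateEvery` interval, and the scripts are hypothetical, and I have not verified that this actually avoids the slowdown.

```java
import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
import javax.script.ScriptException;

public class FreshFactoryLoop {
    // A new ScriptEngineManager discovers factories again, so the returned
    // engine comes from a newly instantiated factory. Returns null when no
    // engine named "nashorn" is installed.
    static ScriptEngine newEngine() {
        return new ScriptEngineManager().getEngineByName("nashorn");
    }

    // Runs `iterations` evaluations, replacing the engine (and the factory
    // behind it) every `recreateEvery` iterations. Returns how many engines
    // were created, or 0 if Nashorn is unavailable.
    static int run(String setup, String script, int iterations, int recreateEvery)
            throws ScriptException {
        if (recreateEvery <= 0) {
            throw new IllegalArgumentException("recreateEvery must be positive");
        }
        ScriptEngine engine = null;
        int created = 0;
        for (int i = 0; i < iterations; i++) {
            if (i % recreateEvery == 0) {
                engine = newEngine();      // drop the old engine and factory
                if (engine == null) {
                    return 0;
                }
                engine.eval(setup);        // rebuild the globals
                created++;
            }
            engine.eval(script);           // result intentionally discarded
        }
        return created;
    }

    public static void main(String[] args) throws ScriptException {
        int created = run("var base = 40;", "base + 2", 100_000, 10_000);
        System.out.println("engines created: " + created);
    }
}
```

How often to recreate is a trade-off: each new engine has to recompile the setup script, so a very small interval will cost throughput on its own.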

Upvotes: 3

the8472

Reputation: 43052

As you noted yourself, it spends a lot of time in JNI reference processing. Reference processing is single-threaded by default. Set -XX:+ParallelRefProcEnabled to let it run in parallel.
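If you want to confirm from inside the process that the flag actually took effect, something like the following should work on a HotSpot JVM. It is only a sketch: the class and method names are made up, and it relies on the com.sun.management.HotSpotDiagnosticMXBean interface, which is HotSpot-specific.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class RefProcFlagCheck {
    // Returns the current value of a HotSpot -XX flag as a string, or null if
    // the running JVM does not expose it (unknown flag or no diagnostic bean).
    static String flagValue(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        if (bean == null) {
            return null;
        }
        try {
            return bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            return null; // flag not known to this JVM version
        }
    }

    public static void main(String[] args) {
        System.out.println("ParallelRefProcEnabled = " + flagValue("ParallelRefProcEnabled"));
        System.out.println("UseConcMarkSweepGC = " + flagValue("UseConcMarkSweepGC"));
    }
}
```

Run it with the same JVM options as the application; a null for UseConcMarkSweepGC would mean the JDK in use no longer ships CMS.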

Since you're using the Nashorn engine, which dynamically generates bytecode, it might also be an issue with class unloading (or the lack of it), but that is not evident from your logs.

Upvotes: 0
