Reputation: 1582
We have an application that usually consists of about 20 JVMs, and we distribute batch jobs to them. All 20 JVMs run on the same operating system instance. Before dispatching a batch job to one of them, it's hard to tell how long the job will run or how much memory it will need. It could take 1 minute or several hours, and memory consumption varies just as widely.
So far this worked well. We have a total of 40GB of memory available, and the max heap size was set to 2GB for each JVM (2GB is sometimes necessary). Since we never had too many "big" batch jobs running at the same time, we never had memory issues, until we moved to the Java 8 VM. It seems that a full GC is triggered less frequently there. We see JVMs that are mostly idle with steadily rising memory usage. When I trigger a GC via jcmd, I can see the OldGen drop from about 1GB to 200MB.
I know that 20 JVMs with a 2GB max heap each, plus stack and Metaspace, is not a good setup, since the worst-case total is a lot more than the 40GB of memory available. But it's a situation we have to live with. And I'd be surprised if there were a way to set a max heap size for a cluster of several JVMs, so I need to come up with other solutions.
I was looking for a VM option that tells the VM to do a full GC at regular intervals, which would very likely solve our problem, but I can't find one.
Any suggestions on how we can set this up to avoid memory swapping?
EDIT: Here is a snippet from the gc log:
2016-04-14T01:02:49.413+0200: 37428.762: [Full GC (Ergonomics) [PSYoungGen: 28612K->0K(629248K)] [ParOldGen: 1268473K->243392K(1309184K)] 1297086K->243392K(1938432K), [Metaspace: 120332K->120320K(1181696K)], 0.3438924 secs] [Times: user=1.69 sys=0.02, real=0.35 secs]
2016-04-14T01:02:52.442+0200: 37431.792: [GC (Allocation Failure) [PSYoungGen: 561664K->67304K(629248K)] 805056K->310696K(1938432K), 0.0315138 secs] [Times: user=0.26 sys=0.00, real=0.03 secs]
2016-04-14T01:02:54.809+0200: 37434.159: [GC (Allocation Failure) [PSYoungGen: 628968K->38733K(623104K)] 872360K->309555K(1932288K), 0.0425780 secs] [Times: user=0.35 sys=0.00, real=0.04 secs]
...
2016-04-14T10:09:03.558+0200: 70202.907: [GC (Allocation Failure) [PSYoungGen: 547152K->41386K(531968K)] 1545772K->1041036K(1841152K), 0.0255883 secs] [Times: user=0.18 sys=0.00, real=0.02 secs]
2016-04-14T10:20:53.634+0200: 70912.984: [GC (Allocation Failure) [PSYoungGen: 531882K->40733K(542720K)] 1531532K->1042107K(1851904K), 0.0306816 secs] [Times: user=0.22 sys=0.02, real=0.03 secs]
2016-04-14T10:23:10.830+0200: 71050.180: [GC (System.gc()) [PSYoungGen: 60415K->37236K(520192K)] 1061790K->1040674K(1829376K), 0.0228505 secs] [Times: user=0.17 sys=0.01, real=0.02 secs]
2016-04-14T10:23:10.853+0200: 71050.203: [Full GC (System.gc()) [PSYoungGen: 37236K->0K(520192K)] [ParOldGen: 1003438K->170089K(1309184K)] 1040674K->170089K(1829376K), [Metaspace: 133559K->129636K(1196032K)], 1.4149811 secs] [Times: user=11.10 sys=0.02, real=1.42 secs]
If we had a full GC every hour, it would solve our problem, I guess.
Upvotes: 1
Views: 264
Reputation: 43052
Instead of attempting to use time-triggered GCs, you could try running with -XX:GCTimeRatio=14 -XX:MaxHeapFreeRatio=30 -XX:MinHeapFreeRatio=20. This tells the collector to keep less headroom, at the cost of collecting more often and spending more CPU cycles on GC.
On current JDK9 builds this could be further combined with -XX:-ShrinkHeapInSteps to let the allocated heap size trail the used heap even more closely. Again, potentially at the expense of performance.
Upvotes: 1
Reputation: 1582
Thanks for all the answers and comments. The solution I came up with combines several of them.
@Peter Lawrey: Calling System.gc() after every batch run makes a lot of sense, and I'm amazed we didn't come up with this earlier. On its own, though, it didn't shrink the memory usage: we would just end up with a 1GB Old Generation that held only 200MB of data.
@the8472: GCTimeRatio didn't seem to help us in any way, but we changed MaxHeapFreeRatio and MinHeapFreeRatio both to 40. Lower values restricted the size of the Young Generation too much, and it never grew beyond 200MB. I assume that setting both parameters to the same value causes a lot of memory allocation and deallocation, but we're still doing well with <1% of time spent in GC. When you're doing plenty of database requests, the performance impact of GC becomes negligible :-)
@Sisyphus: Setting NewRatio to 1 helped by letting the Young Generation and Old Generation have similar sizes. This is probably the change with the highest benefit.
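To check whether settings like these actually bring the committed heap closer to the used heap, something along these lines could be run inside each JVM. It is only a rough sketch: the class name HeapReport and the helper freePercent are made up for illustration, and the whole-heap figure is just a proxy, since how the collector applies the free-ratio flags depends on the collector and JDK version.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical helper: reports how much of the committed heap is unused. */
final class HeapReport {

    private HeapReport() {}

    /** Percentage of the committed heap that is currently free. */
    static double freePercent(long usedBytes, long committedBytes) {
        if (committedBytes <= 0) {
            throw new IllegalArgumentException("committedBytes must be positive");
        }
        return 100.0 * (committedBytes - usedBytes) / committedBytes;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long mb = 1024 * 1024;
        System.out.printf("used=%d MB, committed=%d MB, free=%.1f%%%n",
                heap.getUsed() / mb,
                heap.getCommitted() / mb,
                freePercent(heap.getUsed(), heap.getCommitted()));

        // Print the JVM flags this process was started with, to confirm
        // that the intended -XX options were actually picked up.
        ManagementFactory.getRuntimeMXBean().getInputArguments()
                .forEach(System.out::println);
    }
}
```

For the situation described above (1GB committed, 200MB used), freePercent would report 80% free, well above a MaxHeapFreeRatio of 40.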
Upvotes: 0
Reputation: 533510
There is no point doing a GC at random times.
I would add the GC at the end of a batch (or just after it). It is at this point that the least memory is likely to need to be retained, which makes the GC faster and gives the best shrinkage.
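A minimal sketch of that idea, assuming each batch job can be represented as a Runnable (BatchRunner and runThenCollect are hypothetical names, not part of the application in the question). Note that System.gc() is only a request: the JVM may ignore it, and it does nothing if the JVM runs with -XX:+DisableExplicitGC.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

/** Hypothetical wrapper: runs one batch job, then requests a collection. */
final class BatchRunner {

    private BatchRunner() {}

    /**
     * Runs the job and requests a GC afterwards, even if the job throws.
     * On normal completion, returns the used heap in bytes as reported
     * right after the request.
     */
    static long runThenCollect(Runnable job) {
        try {
            job.run();
        } finally {
            // The job's working set is unreachable now, so this is the
            // point where a collection has the least live data to keep.
            System.gc();
        }
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) {
        long used = runThenCollect(() -> {
            // Stand-in for a real batch job: allocate some short-lived data.
            byte[][] junk = new byte[64][];
            for (int i = 0; i < junk.length; i++) {
                junk[i] = new byte[1 << 20];
            }
        });
        System.out.println("Used heap after batch: " + used / (1024 * 1024) + " MB");
    }
}
```

Putting the call in a finally block means a failed job still gets its memory reclaimed before the JVM goes idle.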
Upvotes: 2