freixo

Reputation: 193

Is there any way to free memory during large data processing?

I have a database where I store invoices. For any given month, I have to run complex calculations through a series of algorithms that use the information from all of that month's invoices. Retrieving and processing the necessary data takes a large amount of memory, since there may be many invoices. The problem gets worse as the interval requested by the user grows to several years. As a result, I'm getting a PermGen exception, apparently because the garbage collector is not doing its job between monthly calculations.

I've always understood that calling System.gc() to hint the GC to do its job is not good practice. So my question is: are there any other ways to free memory? Can you force the JVM to swap to the hard disk in order to store partial calculations temporarily?

I've also tried calling System.gc() at the end of each monthly calculation. The result was high CPU usage (because the garbage collector was being invoked) and noticeably lower memory use. This could do the job, but I don't think it is a proper solution.

Upvotes: 5

Views: 90

Answers (2)

Sushim Mukul Dutta

Reputation: 777

Remember that System.gc() does not actually run the garbage collector. It only requests a collection, and the JVM may or may not honor that request. All we can do is make data structures we no longer need eligible for garbage collection. You can do that by:

  1. Assigning null to any reference to a data structure once you are done with it. When no live thread can reach the object any more, it becomes eligible for garbage collection.
  2. Reusing the same structures instead of creating new ones.
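The two points above can be sketched roughly as follows. This is only an illustration, not the asker's actual code: `loadInvoiceAmounts` is a hypothetical stand-in for the database query, and the invoice amounts are made up. One list is cleared and refilled each month, so last month's objects become unreachable without keeping a separate list per month alive. Note that local variables stop being reachable once they go out of scope, so explicit null assignment mostly matters for long-lived fields.

```java
import java.util.ArrayList;
import java.util.List;

public class MonthlyBufferReuse {

    // Hypothetical stand-in for the real database query:
    // fills 'into' with the invoice amounts for one month.
    public static void loadInvoiceAmounts(int month, List<Double> into) {
        for (int i = 1; i <= 1000; i++) {
            into.add(month * 0.5 + i);
        }
    }

    // Computes one total per month while holding only one month of data at a time.
    public static double[] totalsPerMonth(int months) {
        double[] totals = new double[months];
        List<Double> buffer = new ArrayList<>(); // allocated once, reused every month
        for (int m = 0; m < months; m++) {
            buffer.clear(); // drops references to last month's objects
            loadInvoiceAmounts(m, buffer);
            double sum = 0;
            for (double amount : buffer) {
                sum += amount;
            }
            totals[m] = sum; // keep only the small result, not the raw data
        }
        return totals;
    }

    public static void main(String[] args) {
        double[] totals = totalsPerMonth(24);
        System.out.println("months processed: " + totals.length);
    }
}
```

Only the per-month totals survive the loop, so peak memory should depend on the largest single month rather than on the length of the requested interval.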

Upvotes: 1

durron597

Reputation: 32323

Don't ever use System.gc(). It always takes a long time to run and often doesn't do anything.

The best thing to do is rewrite your code to minimize memory usage as much as possible. You haven't explained exactly how your code works, but here are two ideas:

  • Try to reuse the data structures you create yourself for each month. For example, if you have a list of invoices, reuse that list for the next month.
  • If you need all of it, consider writing the processed data to temporary files as you do the processing, then reloading it when you're ready.
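The temporary-file idea might look something like the sketch below. It assumes each month reduces to a single number, which may not match the real calculations; `computeMonth` is a hypothetical placeholder, and the one-value-per-line text format is chosen only for simplicity.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class SpillToTempFile {

    // Hypothetical placeholder for the real per-month calculation.
    public static double computeMonth(int month) {
        return month * 100.0;
    }

    // Writes one result per line to a temp file instead of keeping them all in memory.
    public static Path spillResults(int months) {
        try {
            Path tmp = Files.createTempFile("monthly-results", ".txt");
            try (BufferedWriter out = Files.newBufferedWriter(tmp, StandardCharsets.UTF_8)) {
                for (int m = 0; m < months; m++) {
                    out.write(Double.toString(computeMonth(m)));
                    out.newLine();
                }
            }
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the partial results back once the final step needs them.
    public static List<Double> reload(Path file) {
        List<Double> results = new ArrayList<>();
        try (BufferedReader in = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = in.readLine()) != null) {
                results.add(Double.parseDouble(line));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return results;
    }

    public static void main(String[] args) throws IOException {
        Path tmp = spillResults(36);
        try {
            System.out.println("reloaded " + reload(tmp).size() + " monthly results");
        } finally {
            Files.deleteIfExists(tmp); // clean up the temp file when done
        }
    }
}
```

If the final step can consume the results one at a time, reading the file line by line instead of reloading everything into a list would keep memory use lower still.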

Upvotes: 2
