devj

Reputation: 1153

Java garbage collector performance for allocating/deallocating memory inside loop

I have a program in which the loop in question looks something like this:

int numOfWords = 1000;
int avgSizeOfWord = 20;
while(all documents are not read) {
    char[][] wordsInDoc = new char[numOfWords][avgSizeOfWord];
    for(int i=0; i<numWordsInDoc; i++) {
        wordsInDoc[i] = getNextWord();
    }
    processWords(wordsInDoc);
}

I was wondering what happens behind the scenes when this loop executes. When does the garbage collector reclaim the memory that was allocated for each document? Is there a better way (with respect to memory usage) to do the same thing?

Any insight is appreciated.

Upvotes: 1

Views: 961

Answers (5)

Santosh

Reputation: 17893

My few cents :)

  1. I guess that when you create an array of objects in Java, unlike in C/C++, you don't actually reserve memory for the objects themselves; you simply create that many references.
  2. Each reference occupies some memory (surely less than the memory occupied by the object it points to), so it should not matter whether you use a plain array or an ArrayList (which does the same thing in a type-safe way).
  3. The basic problem with the approach shown is that it loads the entire document into memory before sending it for processing.
  4. A better, more efficient way is to stream the document (buffered) and process it on the fly. This prevents the entire document from being loaded into memory.
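The streaming idea in point 4 could look roughly like the sketch below. It is only an illustration: the class name StreamingWords, the method processStreaming, and the whitespace-based word splitting are assumptions, not anything from the question.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Hypothetical helper: names and the splitting rule are illustrative assumptions.
class StreamingWords {
    // Reads the source line by line and hands each word to the handler as soon
    // as it is read, so only the current line is held in memory.
    // Returns the number of words handed to the handler.
    static int processStreaming(Reader source, Consumer<String> handler) {
        int count = 0;
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                for (String word : line.trim().split("\\s+")) {
                    if (!word.isEmpty()) {   // blank lines yield one empty token
                        handler.accept(word);
                        count++;
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        int n = processStreaming(new StringReader("Hello World.\nThis is a sentence"),
                                 word -> System.out.println(word));
        System.out.println(n + " words processed");
    }
}
```

Whether this helps depends on processWords: if it genuinely needs all words of a document at once, streaming word by word is not an option.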

Regarding GC, as others here have pointed out, it's impossible to predict. It kicks in whenever the JVM is running short of memory, but that's just a cliché :).

Upvotes: 0

NPE

Reputation: 500177

It is impossible to answer your question in general, as the JVM can do pretty much whatever it wants with regard to garbage collection.

You might be able to gain some insight into what actually happens by running your program under a memory profiler such as YourKit. This will also enable you to compare different strategies (e.g. using the String class instead of char arrays) in terms of memory usage and time spent in the garbage collector.
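If a full profiler is not at hand, a rough first measurement is also possible from inside the program via the standard java.lang.management API. The sketch below is an assumption about how one might do that, not a replacement for a profiler; the class name GcStats and the loop sizes are made up for illustration.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: sums the counters that the JVM's collectors expose.
class GcStats {
    // Total number of collections so far, across all collectors.
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) total += count;
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds.
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) total += millis;
        }
        return total;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();

        long checksum = 0;
        for (int doc = 0; doc < 10_000; doc++) {
            char[][] wordsInDoc = new char[1000][20]; // same shape as in the question
            checksum += wordsInDoc.length;            // keep the allocation observable
        }

        System.out.println("collections: " + (totalCollections() - countBefore));
        System.out.println("GC millis:   " + (totalCollectionMillis() - millisBefore));
        System.out.println("checksum:    " + checksum);
    }
}
```

The numbers vary with the JVM, the collector, and the heap settings, so they are only useful for comparing strategies on the same setup.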

Upvotes: 4

Peter Lawrey

Reputation: 533472

It is likely that you are creating arrays that you immediately discard. A more efficient approach is to create only the outer array of arrays, or to use a List.

char[][] wordsInDoc = new char[numOfWords][];
for(int i=0; i<numWordsInDoc; i++) {
    wordsInDoc[i] = getNextWord();
}
processWords(wordsInDoc);

OR

List<char[]> wordsInDoc = new ArrayList<char[]>();
for(int i=0; i<numWordsInDoc; i++) {
    wordsInDoc.add(getNextWord());
}
processWords(wordsInDoc);

OR use Strings

String line = "Hello World. This is a Sentence";
String[] words = line.split(" +");
processWords(words);

Upvotes: 1

maple_shaft

Reputation: 10463

The garbage collector works in mysterious ways. Even calling it directly with System.gc() is merely a suggestion.

If you want to find out when an object is garbage collected, you can override finalize() and log the time at which it runs.
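Since finalize() has been deprecated since Java 9, a swapped-in alternative for the same experiment is a WeakReference registered with a ReferenceQueue. This is a minimal sketch under that assumption; the class name GcObserver and the one-second wait are illustrative choices, and the outcome is not guaranteed because System.gc() is only a hint.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

// Hypothetical helper: observes collection without overriding finalize().
class GcObserver {
    // Registers the array with the queue; the reference is enqueued
    // some time after the array becomes unreachable and is collected.
    static WeakReference<char[][]> watch(char[][] words, ReferenceQueue<char[][]> queue) {
        return new WeakReference<>(words, queue);
    }

    public static void main(String[] args) throws InterruptedException {
        ReferenceQueue<char[][]> queue = new ReferenceQueue<>();
        char[][] words = new char[1000][];
        WeakReference<char[][]> ref = watch(words, queue);

        words = null;  // drop the only strong reference
        System.gc();   // a request, not a command

        // Wait up to one second for the reference to be enqueued.
        Object enqueued = queue.remove(1000);
        if (enqueued != null) {
            System.out.println("collected, observed at " + System.currentTimeMillis());
        } else {
            System.out.println("not collected within one second");
        }
    }
}
```

Either message may appear, depending on the JVM and its settings.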

Upvotes: 0

Jon Skeet

Reputation: 1499830

Well you're definitely wasting memory - you're allocating all of the "sub-arrays" and then overwriting them. You'd be better off with:

while(all documents are not read) {
    char[][] wordsInDoc = new char[numOfWords][];
    for(int i=0; i < numWordsInDoc; i++) {
        wordsInDoc[i] = getNextWord();
    }
    processWords(wordsInDoc);
}
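To make the difference between the two allocation forms concrete, here is a small sketch. The class name SubArrayDemo and the helper countAllocatedRows are made up for illustration; the sizes mirror numOfWords and avgSizeOfWord from the question.

```java
// Hypothetical demo: counts how many rows of a char[][] are already allocated.
class SubArrayDemo {
    static int countAllocatedRows(char[][] rows) {
        int allocated = 0;
        for (char[] row : rows) {
            if (row != null) allocated++;
        }
        return allocated;
    }

    public static void main(String[] args) {
        // Both dimensions given: the outer array plus 1000 inner arrays are created.
        char[][] eager = new char[1000][20];
        // Only the first dimension given: just the outer array, every row is null.
        char[][] lazy = new char[1000][];

        System.out.println(countAllocatedRows(eager)); // prints 1000
        System.out.println(countAllocatedRows(lazy));  // prints 0
    }
}
```

In the original loop, each of those 1000 inner arrays becomes garbage as soon as its slot is overwritten with the result of getNextWord().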

Now what does processWords actually do? If it doesn't stash the array anywhere, you could reuse it:

char[][] wordsInDoc = new char[numOfWords][];
while(all documents are not read) {
    for(int i=0; i < numWordsInDoc; i++) {
        wordsInDoc[i] = getNextWord();
    }
    processWords(wordsInDoc);
}

I would definitely perform the first change, but probably not the second.
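One reason to be wary of the second change can be sketched as follows. The example assumes a processWords that keeps the array it receives, modeled here by a plain list; the class name ReuseHazard and the sample words are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo of the aliasing risk when the outer array is reused.
class ReuseHazard {
    // Processes two tiny "documents" and returns the first word of the
    // first stored document, as seen after both have been processed.
    static String firstWordOfFirstDoc(boolean reuseBuffer) {
        String[][] docs = { {"alpha", "beta"}, {"gamma", "delta"} };
        List<char[][]> stash = new ArrayList<>(); // stands in for a processWords that keeps its argument
        char[][] shared = new char[2][];

        for (String[] doc : docs) {
            char[][] wordsInDoc = reuseBuffer ? shared : new char[doc.length][];
            for (int i = 0; i < doc.length; i++) {
                wordsInDoc[i] = doc[i].toCharArray();
            }
            stash.add(wordsInDoc); // "processing" that stashes the array
        }
        return new String(stash.get(0)[0]);
    }

    public static void main(String[] args) {
        System.out.println(firstWordOfFirstDoc(false)); // prints alpha
        System.out.println(firstWordOfFirstDoc(true));  // prints gamma
    }
}
```

With reuse, both stored entries point at the same array, so the first document's words are silently replaced by the second's.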

As for when exactly garbage collection occurs - that's implementation-specific.

Upvotes: 3
