aneuryzm
aneuryzm

Reputation: 64874

What's an effective way to reuse ArrayLists in a for loop?

I'm reusing the same ArrayList in a for loop, and I use

for loop
    results = new ArrayList<Integer>();
    experts = new ArrayList<Integer>();
    output = new ArrayList<String>();
....

to create new ones.

I guess this is wrong, because I'm allocating new memory. Is this correct ? If yes, how can I empty them ?

Added: another example

I'm creating new variables each time I call this method. Is this good practice ? I mean to create new precision, relevantFound.. etc ? Or should I declare them in my class, outside the method to not allocate more and more memory ?

public static void computeMAP(ArrayList<Integer> results, ArrayList<Integer> experts) {

  //compute MAP
  double precision = 0;
  int relevantFound = 0;
  double sumprecision = 0;

thanks

Upvotes: 3

Views: 6104

Answers (5)

Christian Garbin
Christian Garbin

Reputation: 2532

The code snippet below measures the difference between allocating a new list inside the loop and calling clear() to reuse an existing list.

Allocating a new list is slower, as pointed out a few times above. This gives an idea of how much.

Note that the code loops 100,000 times to get those numbers. For UI code the difference may not matter. For other applications it can be a significant improvement to reuse the list.

This is the result of three runs:

Elapsed time - in the loop: 2198 
Elapsed time - with clear(): 1621

Elapsed time - in the loop: 2291 
Elapsed time - with clear(): 1621   

Elapsed time - in the loop: 2182 
Elapsed time - with clear(): 1605

Having said that, if the lists are holding hundreds or even thousands of objects, the allocation of the array itself will pale in comparison with the allocation of the objects. The performance bottleneck will be related to the objects being added to the array, not with the array.

For completeness: code was measured with Java 1.6.0_19, running on a Centrino 2 laptop with Windows. However, the main point is the difference between them, not the exact number.

import java.util.*;

public class Main {
    public static void main(String[] args)    {

      // Allocates a new list inside the loop
      long startTime = System.currentTimeMillis();
      for( int i = 0; i < 100000; i++ ) {
         List<String> l1 = new ArrayList<String>();
         for( int j = 0; j < 1000; j++ )
            l1.add( "test" );
      }
      System.out.println( "Elapsed time - in the loop: " + (System.currentTimeMillis() - startTime) );

      // Reuse the list
      startTime = System.currentTimeMillis();
      List<String> l2 = new ArrayList<String>();
      for( int i = 0; i < 100000; i++ ) {
         l2.clear();
         for( int j = 0; j < 1000; j++ )
            l2.add( "test" );
      }
      System.out.println( "Elapsed time - with clear(): " + (System.currentTimeMillis() - startTime) );
    }
}

Upvotes: 2

tzaman
tzaman

Reputation: 47860

ArrayList.clear() will empty them for you; note that doing it your way is also 'okay', since Java is garbage-collected, so the old allocations will eventually get cleaned up. Still, it's better to avoid lots of new allocations (and garbage generation), so the better way would be to move those declarations outside the loop and put in calls to clear inside it.

For your second example, either way is fine; primitive types are typically going to get allocated only once (on the stack, when you enter the function), declaring them inside a loop doesn't increase the cost any. It's only heap allocations (i.e. calls to new) you need to worry about.

In response to comment:
If it doesn't make sense for those things to be instance members, then don't make them such. Also, using new to 'clean' them means allocating new objects every time; definitely don't do that - if your method needs a new copy of something every time it's invoked, and it isn't used anywhere except that method, then it has no business being an instance variable.
In general, worrying about such micro-optimizations at this point is counter-productive; you only think about it if you really, absolutely have to, and then measure whether there's a benefit before doing anything.

Upvotes: 6

Jeff
Jeff

Reputation: 145

first, allocating primitive types is practically free in java, so don't worry about that.

with regard to objects, it really depends on the loop. if it's a tight loop to 100k then yes, it's a big deal to allocate 3 array list objects each time through the loop. it'd be better to allocate them outside of the loop and use List.clear().

you also have to consider where the code is running. if it's a mobile platform you will be more concerned about frequent garbage collection than you would on a server with 256GB of ram and 64 CPUs.

that all being said, no one if going to beat you up for coding for performance, whatever the platform. performance is often a trade off with code cleanliness. for example, on the android platform they recommend using the for (int i = 0 ...) syntax to loop through array lists vs. for (Object o: someList). the latter method is cleaner, but on a mobile platform the performance difference is significant. in this case i don't think clear()'ing outside of the loop makes things any harder to understand.

Upvotes: 1

Eyal Schneider
Eyal Schneider

Reputation: 22456

Another issue to take into consideration here is how garbage collection is affected. It is clear that reusing the same ArrayList references and using ArrayList.clear() reduces instance creations.

However, garbage collection is not so simple, and apparently here we force 'old' objects to reference 'newer' objects. That means more old-to-young references (i.e. references from objects in the old generation to ones in the young generation). This kind of references result in more work during garbage collection (See this article for example).

I never tried to benchmark this, and I don't know how significant this is, but I thought it could be relevant for this discussion. Maybe if the number of list items significantly outnumbers the number of lists, it is not worthwhile to use the same lists.

Upvotes: 0

Daniel
Daniel

Reputation: 28104

ArrayLists allocate a default memory for 5 entries. These entries are references, which need 4 bytes each (depending on architecture, maybe even 8 byte). An array list contains an int for its "real" length, which are already 24 bytes. Add the default 16 bytes, which every object (even without instance variables) has, so you at with at least 40 bytes for each ArrayList(). Depending if the store them all, or how many you have, this might be a performance loss.

Not however, that starting with Java 1.6.16, the JVM has a (default off?) feature, which "inlines" objects within a function, if no access to those objects leaves the methods context. In this case all instance variables are compiled in as being used as "local" instance variables of the calling function, so no real objects will be created.

Upvotes: 0

Related Questions