Reputation: 13087

Array and ArrayList positional access performance

i just read this code example: http://robaustin.wikidot.com/how-does-the-performance-of-arraylist-compare-to-array

what causing j = INT_ARRAY[i]; to be three times faster than j = ARRAY_LIST.get(i)

I know that ArrayList internally uses an array. So i would like to know in details what are the extra operations that add this time ( calling methods, casting, other JVM considerations etc..)

Thanks in advance.

Upvotes: 3

Answers (6)

Kostadin

Reputation: 2569

There is not best way. I have 3 different approaches based on differnet situations: 1. When I have to do very big for cycle with low cost operations inside - yes improve access to data gives to you good percent of optimisation. 2. If case is same as 1. but with heavy operations inside - optimize access is just a very litle optimization - better is to optimize fields in object and operations with them. 3. Lots of cycles with heavy calculations and nothing more to optimize - make some 'bad practice' programming. For example: Instead of returning new value every time - pass work variable to function and return result in it. It sounds stupid, but reduces variable creation and memory fragmentation. In my case this gived to me 15-25% better time. Reason? Reduced GC calls. No spend time to call constructors.

Upvotes: 0

Rex Kerr

Reputation: 167921

The test is badly written. You can't learn much of anything from it.

It is true that accessors like get take some time, but e.g. the Sun JVM can optimize many of those away to almost nothing. In particular, ArrayList get needn't really take any significant extra time.

Here's a benchmark (written in Scala, but using Java's arrays and ArrayList) demonstrating just how small the difference is when you're actually using (all of) the values in your array:

object ArraySpeed {
  def ptime[A](f: => A) = {
    val t0 = System.nanoTime
    val ans = f
    printf("Elapsed: %.3f seconds\n",(System.nanoTime-t0)*1e-9)
    ans
  }

  val a = Array.range(0,1000000).map(x => new java.lang.Integer(x))
  val b = new java.util.ArrayList[java.lang.Integer]
  a.foreach(x => b.add(x))

  var j = 0

  def jfroma = {
    var i=0
    while (i<1000000) {
      j += a(i).intValue
      i += 1
    }
    j
  }

  def jfromb = {
    var i=0
    while (i<1000000) {
      j += b.get(i).intValue
      i += 1
    }
    j
  }

  def main(args: Array[String]) {
    for (i <- 1 to 5) {
      ptime(for (j <- 1 to 100) yield jfroma)
      ptime(for (j <- 1 to 100) yield jfromb)
      println
    }
  }
}

Running it gives:

$ scalac ArraySpeed.scala
$ scala ArraySpeed
Elapsed: 0.324 seconds   // This is direct array access
Elapsed: 0.378 seconds   // This is ArrayList

Elapsed: 0.326 seconds
Elapsed: 0.389 seconds

Elapsed: 0.355 seconds
Elapsed: 0.349 seconds

Elapsed: 0.323 seconds
Elapsed: 0.333 seconds

Elapsed: 0.318 seconds
Elapsed: 0.331 seconds

And Scala bytecode for stuff like this is pretty much identical to the Java bytecode, so this is a pretty fair comparison. (The scala command is just a wrapper to call java with the right library in the classpath.)

Upvotes: 3

James Kingsbery

Reputation: 7496

It probabaly helps in figuring out what caused the difference (on his particular processor, JVM, OS, etc.) to look at the generated byte code.

For readFromArrayList:

   6:   goto    25
   9:   getstatic       #47; //Field ARRAY_LIST:Ljava/util/List;
   12:  iload_3
   13:  invokeinterface #116,  2; //InterfaceMethod java/util/List.get:(I)Ljava/lang/Object;
   18:  checkcast       #17; //class java/lang/Integer
   21:  astore_0
   22:  iinc    3, 1
   25:  iload_3
   26:  getstatic       #25; //Field INT_ARRAY:[Ljava/lang/Integer;
   29:  arraylength
   30:  if_icmplt       9

For readFromArray:

   6:   goto    18
   9:   getstatic       #25; //Field INT_ARRAY:[Ljava/lang/Integer;
   12:  iload_3
   13:  aaload
   14:  astore_0
   15:  iinc    3, 1
   18:  iload_3
   19:  getstatic       #25; //Field INT_ARRAY:[Ljava/lang/Integer;
   22:  arraylength
   23:  if_icmplt       9

I don't know if I buy the "three times" difference, but whatever differences there are can be traced to the differences at op #13: aaload (for the array) vs. invokeinterface and checkcast (for the ArrayList).

Upvotes: 2

zcourts

Reputation: 5041

Without checking the link posted and assuming that an array list is 3 times slower as you say then the speed difference is likely to vary accross JVMs, several things could influence the speed of getting a value back. The results from that article are likely to vary depending on the control variables in effect when performing the tests. A collection should not be surprisingly a bit slower either considering that various checks are completed before completing the operation. Adding to an arraylist for example calls

 public void ensureCapacity(int minCapacity) {
    modCount++;
    int oldCapacity = elementData.length;
    if (minCapacity > oldCapacity) {
        Object oldData[] = elementData;
        int newCapacity = (oldCapacity * 3)/2 + 1;
            if (newCapacity < minCapacity)
        newCapacity = minCapacity;
            // minCapacity is usually close to size, so this is a win:
            elementData = Arrays.copyOf(elementData, newCapacity);
    }
    }

As that clearly shows there are checks and data is copied and replaced. So to put it simply there is no one answer, the test conditions are likely to affect the results.

Upvotes: -2

Bryan Kyle

Reputation: 13771

My understanding is that the JVM has specific opcodes for working with arrays. It's likely the performance difference is overhead of method calls and such. Why not write a simple test case and use javad to have a look at what the code compiles down into. That should give you an idea.

Upvotes: 1

Jon Skeet

Reputation: 1503859

The performance will very much depend on the VM involved, and a variety of other considerations. The blanket statement at the start of that article leads me to suspect that the author has little clue about how performance varies on JVM - and the rest of the test code confirms that. It doesn't test for nearly long enough, nor does it use any JVM warmup period or anything similar. Oh, and it uses INT_ARRAY.length when testing the ArrayList version, meaning that one potential source of JIT optimization is removed. Really not a good article.

However, it's fairly easy to consider the things that ArrayList.get() involves beyond the normal array access:

A nullity check (to see whether the ArrayList reference is non-null). This is in addition to the nullity check for the array itself, which is required for both the array and ArrayList cases.
Potentially a virtual method indirection, depending on whether the JIT has managed to inline the call
A bounds check - not the same as for the array access, as the size of the list is usually smaller than the length of the array.

Ultimately though, the performance of one single method call doesn't matter much. What mattes is whether that's significant in your actual use case. Does your application spend most of its time getting individual elements from a collection? Does it do so in the kind of loop shown in the article, which does nothing else and may therefore benefit from additional JIT optimizations in one case or the other?

Microbenchmarking is fun, but you need to be aware of its limitations when it comes to providing you useful information.

Upvotes: 10

Array and ArrayList positional access performance

Answers (6)

Related Questions