Reputation: 13087
i just read this code example: http://robaustin.wikidot.com/how-does-the-performance-of-arraylist-compare-to-array
what causing j = INT_ARRAY[i]; to be three times faster than j = ARRAY_LIST.get(i)
I know that ArrayList internally uses an array. So i would like to know in details what are the extra operations that add this time ( calling methods, casting, other JVM considerations etc..)
Thanks in advance.
Upvotes: 3
Views: 4154
Reputation: 2569
There is not best way. I have 3 different approaches based on differnet situations: 1. When I have to do very big for cycle with low cost operations inside - yes improve access to data gives to you good percent of optimisation. 2. If case is same as 1. but with heavy operations inside - optimize access is just a very litle optimization - better is to optimize fields in object and operations with them. 3. Lots of cycles with heavy calculations and nothing more to optimize - make some 'bad practice' programming. For example: Instead of returning new value every time - pass work variable to function and return result in it. It sounds stupid, but reduces variable creation and memory fragmentation. In my case this gived to me 15-25% better time. Reason? Reduced GC calls. No spend time to call constructors.
Upvotes: 0
Reputation: 167921
The test is badly written. You can't learn much of anything from it.
It is true that accessors like get
take some time, but e.g. the Sun JVM can optimize many of those away to almost nothing. In particular, ArrayList
get needn't really take any significant extra time.
Here's a benchmark (written in Scala, but using Java's arrays and ArrayList
) demonstrating just how small the difference is when you're actually using (all of) the values in your array:
object ArraySpeed {
def ptime[A](f: => A) = {
val t0 = System.nanoTime
val ans = f
printf("Elapsed: %.3f seconds\n",(System.nanoTime-t0)*1e-9)
ans
}
val a = Array.range(0,1000000).map(x => new java.lang.Integer(x))
val b = new java.util.ArrayList[java.lang.Integer]
a.foreach(x => b.add(x))
var j = 0
def jfroma = {
var i=0
while (i<1000000) {
j += a(i).intValue
i += 1
}
j
}
def jfromb = {
var i=0
while (i<1000000) {
j += b.get(i).intValue
i += 1
}
j
}
def main(args: Array[String]) {
for (i <- 1 to 5) {
ptime(for (j <- 1 to 100) yield jfroma)
ptime(for (j <- 1 to 100) yield jfromb)
println
}
}
}
Running it gives:
$ scalac ArraySpeed.scala
$ scala ArraySpeed
Elapsed: 0.324 seconds // This is direct array access
Elapsed: 0.378 seconds // This is ArrayList
Elapsed: 0.326 seconds
Elapsed: 0.389 seconds
Elapsed: 0.355 seconds
Elapsed: 0.349 seconds
Elapsed: 0.323 seconds
Elapsed: 0.333 seconds
Elapsed: 0.318 seconds
Elapsed: 0.331 seconds
And Scala bytecode for stuff like this is pretty much identical to the Java bytecode, so this is a pretty fair comparison. (The scala
command is just a wrapper to call java
with the right library in the classpath.)
Upvotes: 3
Reputation: 7496
It probabaly helps in figuring out what caused the difference (on his particular processor, JVM, OS, etc.) to look at the generated byte code.
For readFromArrayList:
6: goto 25
9: getstatic #47; //Field ARRAY_LIST:Ljava/util/List;
12: iload_3
13: invokeinterface #116, 2; //InterfaceMethod java/util/List.get:(I)Ljava/lang/Object;
18: checkcast #17; //class java/lang/Integer
21: astore_0
22: iinc 3, 1
25: iload_3
26: getstatic #25; //Field INT_ARRAY:[Ljava/lang/Integer;
29: arraylength
30: if_icmplt 9
For readFromArray:
6: goto 18
9: getstatic #25; //Field INT_ARRAY:[Ljava/lang/Integer;
12: iload_3
13: aaload
14: astore_0
15: iinc 3, 1
18: iload_3
19: getstatic #25; //Field INT_ARRAY:[Ljava/lang/Integer;
22: arraylength
23: if_icmplt 9
I don't know if I buy the "three times" difference, but whatever differences there are can be traced to the differences at op #13: aaload (for the array) vs. invokeinterface and checkcast (for the ArrayList).
Upvotes: 2
Reputation: 5041
Without checking the link posted and assuming that an array list is 3 times slower as you say then the speed difference is likely to vary accross JVMs, several things could influence the speed of getting a value back. The results from that article are likely to vary depending on the control variables in effect when performing the tests. A collection should not be surprisingly a bit slower either considering that various checks are completed before completing the operation. Adding to an arraylist for example calls
public void ensureCapacity(int minCapacity) {
modCount++;
int oldCapacity = elementData.length;
if (minCapacity > oldCapacity) {
Object oldData[] = elementData;
int newCapacity = (oldCapacity * 3)/2 + 1;
if (newCapacity < minCapacity)
newCapacity = minCapacity;
// minCapacity is usually close to size, so this is a win:
elementData = Arrays.copyOf(elementData, newCapacity);
}
}
As that clearly shows there are checks and data is copied and replaced. So to put it simply there is no one answer, the test conditions are likely to affect the results.
Upvotes: -2
Reputation: 13771
My understanding is that the JVM has specific opcodes for working with arrays. It's likely the performance difference is overhead of method calls and such. Why not write a simple test case and use javad
to have a look at what the code compiles down into. That should give you an idea.
Upvotes: 1
Reputation: 1503859
The performance will very much depend on the VM involved, and a variety of other considerations. The blanket statement at the start of that article leads me to suspect that the author has little clue about how performance varies on JVM - and the rest of the test code confirms that. It doesn't test for nearly long enough, nor does it use any JVM warmup period or anything similar. Oh, and it uses INT_ARRAY.length
when testing the ArrayList
version, meaning that one potential source of JIT optimization is removed. Really not a good article.
However, it's fairly easy to consider the things that ArrayList.get() involves beyond the normal array access:
Ultimately though, the performance of one single method call doesn't matter much. What mattes is whether that's significant in your actual use case. Does your application spend most of its time getting individual elements from a collection? Does it do so in the kind of loop shown in the article, which does nothing else and may therefore benefit from additional JIT optimizations in one case or the other?
Microbenchmarking is fun, but you need to be aware of its limitations when it comes to providing you useful information.
Upvotes: 10