Reputation:
Please compare two ways of setting/returning an array:
static public float[] test_arr_speeds_1( int a ) {
    return new float[]{ a, a + 1, a + 2, a + 3, a + 4, a + 5,
                        a + 6, a + 7, a + 8, a + 9 };
} // or e.g. field = new float... in method
static public float[] test_arr_speeds_2( int a ) {
    float[] ret = new float[10];
    ret[0] = a;
    ret[1] = a + 1;
    ret[2] = a + 2;
    ret[3] = a + 3;
    ret[4] = a + 4;
    ret[5] = a + 5;
    ret[6] = a + 6;
    ret[7] = a + 7;
    ret[8] = a + 8;
    ret[9] = a + 9;
    return ret;
} // or e.g. field[0] = ... in method
Both compile to distinct bytecode, and both decompile back to their original form. After measuring the execution times with a profiler (100M iterations, unbiased, different environments), the _1 method takes approx. 4/3 the time of _2, even though both create a new array and both set every element to a given value. The times are negligible most of the time, but this still bugs me: why is _1 visibly slower? Can anybody check/confirm/explain it to me in a reasonable, JVM-supported way?
Upvotes: 5
Views: 165
Reputation: 340873
Here is the difference in the bytecode (only the first two elements are shown). First method:
bipush 10
newarray float //creating an array with reference on operand stack
dup
iconst_0
iload_0
i2f
fastore //setting first element
dup
iconst_1
iload_0
iconst_1
iadd
i2f
fastore //setting second element
//...
areturn //returning the top of the operand stack
Second method:
bipush 10
newarray float
astore_1 //creating an array and storing it in local variable
aload_1
iconst_0
iload_0
i2f
fastore //setting first element
aload_1
iconst_1
iload_0
iconst_1
iadd
i2f
fastore //setting second element
//...
aload_1
areturn
As you can see, the only difference is that in the first scenario the array reference is kept on the operand stack (that's why dup appears so many times: each fastore consumes the reference, so it has to be duplicated first to avoid losing it), while in the second scenario the array reference is kept in a local variable slot (where method arguments and local variables live). There the reference must be reloaded before every store (aload_1), because fastore requires the arrayref to be on the operand stack.
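If you want to reproduce the two listings yourself, a minimal sketch along these lines should do. The class name ArrayInitDemo, the method names, and the shortened three-element arrays are my own assumptions for illustration, not part of the question.

```java
import java.util.Arrays;

// Hypothetical demo class; all names here are illustrative.
class ArrayInitDemo {

    // Array initializer form: javac keeps the array reference on the
    // operand stack and emits dup before each fastore.
    static float[] viaInitializer(int a) {
        return new float[]{ a, a + 1, a + 2 };
    }

    // Element-by-element form: the reference is stored in a local
    // variable slot and reloaded with aload before each fastore.
    static float[] viaAssignments(int a) {
        float[] ret = new float[3];
        ret[0] = a;
        ret[1] = a + 1;
        ret[2] = a + 2;
        return ret;
    }

    public static void main(String[] args) {
        // Both forms must produce the same contents.
        System.out.println(Arrays.equals(viaInitializer(5), viaAssignments(5)));
    }
}
```

Compile it with javac ArrayInitDemo.java and disassemble with javap -c ArrayInitDemo to compare the two methods side by side. With three elements the array size is pushed with iconst_3 rather than bipush 10, but the dup versus aload_1 pattern is the same.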
We shouldn't make assumptions based on this bytecode - after all, it is translated to CPU instructions by the JIT compiler, and most likely in both cases the array reference is stored in one of the CPU registers. Otherwise the performance difference would be huge.
If you can measure the difference and you really need such low-level optimizations, pick the version that is faster. But I doubt the difference is "portable" (depending on the architecture and the JVM version/implementation you will observe different timing behaviour). That being said, I would go for the more readable version rather than the one that happens to be faster on your computer.
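For what it's worth, here is a rough sketch of how one might time the two variants by hand. The class name ArraySpeedSketch, the helper time(), and the iteration counts are assumptions of mine; hand-rolled timing like this is easily distorted by JIT warm-up and GC, so treat the numbers as a hint only (a dedicated harness such as JMH is generally more trustworthy).

```java
import java.util.function.IntFunction;

// Hypothetical timing sketch; names and iteration counts are illustrative.
class ArraySpeedSketch {

    static float[] viaInitializer(int a) {
        return new float[]{ a, a + 1, a + 2, a + 3, a + 4, a + 5,
                            a + 6, a + 7, a + 8, a + 9 };
    }

    static float[] viaAssignments(int a) {
        float[] ret = new float[10];
        ret[0] = a;     ret[1] = a + 1; ret[2] = a + 2; ret[3] = a + 3;
        ret[4] = a + 4; ret[5] = a + 5; ret[6] = a + 6; ret[7] = a + 7;
        ret[8] = a + 8; ret[9] = a + 9;
        return ret;
    }

    // Runs f the given number of times and returns the elapsed nanoseconds.
    static long time(IntFunction<float[]> f, int iterations) {
        float sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += f.apply(i)[9]; // use the result so the call is not trivially dead
        }
        long elapsed = System.nanoTime() - start;
        if (sink == 42f) System.out.println(sink); // keeps sink observable
        return elapsed;
    }

    public static void main(String[] args) {
        // Warm-up rounds so the JIT has a chance to compile both methods.
        for (int round = 0; round < 5; round++) {
            time(ArraySpeedSketch::viaInitializer, 1_000_000);
            time(ArraySpeedSketch::viaAssignments, 1_000_000);
        }
        System.out.println("initializer ns: " + time(ArraySpeedSketch::viaInitializer, 10_000_000));
        System.out.println("assignments ns: " + time(ArraySpeedSketch::viaAssignments, 10_000_000));
    }
}
```

Running it several times, and swapping the order of the two measurements, gives a feel for how much of any gap is just noise.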
Upvotes: 6