Katona
Katona

Reputation: 4901

Performance of 2D array allocation

I am wondering why allocation of a 2D int array at once (new int[50][2]) performs poorer than allocating separately, that is, execute new int[50][] first, then new int[2] one-by-one. Here is a non-professional benchmark code:

public class AllocationSpeed {

    private static final int ITERATION_COUNT = 1000000;

    public static void main(String[] args) {
        new AllocationSpeed().run();
    }

    private void run() {
        measureSeparateAllocation();
        measureAllocationAtOnce();
    }

    private void measureAllocationAtOnce() {
        Stopwatch stopwatch = Stopwatch.createStarted();
        for (int i = 0; i < ITERATION_COUNT; i++) {
            allocateAtOnce();
        }
        stopwatch.stop();
        System.out.println("Allocate at once: " + stopwatch);
    }

    private int allocateAtOnce() {
        int[][] array = new int[50][2];
        return array[10][1];
    }

    private void measureSeparateAllocation() {
        Stopwatch stopwatch = Stopwatch.createStarted();
        for (int i = 0; i < ITERATION_COUNT; i++) {
            allocateSeparately();
        }
        stopwatch.stop();
        System.out.println("Separate allocation: " + stopwatch);
    }

    private int allocateSeparately() {
        int[][] array = new int[50][];
        for (int i = 0; i < array.length; i++) {
            array[i] = new int[2];
        }
        return array[10][1];
    }
}

I tested on 64 bit linux, these are the results with different 64 bit oracle java versions:

1.6.0_45-b06:

Separate allocation: 401.0 ms
Allocate at once: 1.673 s

1.7.0_45-b18

Separate allocation: 408.7 ms
Allocate at once: 1.448 s

1.8.0-ea-b115

Separate allocation: 380.0 ms
Allocate at once: 1.251 s

Just for curiosity, I tried it with OpenJDK 7 as well (where the difference is smaller):

Separate allocation: 424.3 ms
Allocate at once: 1.072 s

For me it's quite counter-intuitive, I would expect allocating at once to be faster.

Upvotes: 7

Views: 218

Answers (2)

Pr0methean
Pr0methean

Reputation: 315

The latter code's inner loop (with a newarray) is hit more times than the former code's multianewarray, so it probably hits C2 and gets subjected to escape analysis sooner. (Once that happens, the rows created by the latter code are allocated on the stack, which is faster than the heap and reduces the workload for the garbage collector.)

It's also possible that these JDK versions didn't actually do escape analysis on rows from a multianewarray, since a multidimensional array is more likely to exceed the size limit for a stack array.

Upvotes: 0

Joop Eggen
Joop Eggen

Reputation: 109547

Absolute unbelievable. A benchmark source might suffer from optimizations, gc and JIT, but this?

Looking at the java byte code instruction set:

  • anewarray (+ 2 bytes indirect class index) for arrays of object classes (a = address)
  • newarray (+ 1 byte for prinitive class) for arrays of primitive types
  • multianewarray (+ 2 bytes indirect class index) for multidimensional arrays

This leads one to suspect that multianewarray is suboptimal for primitive types.

Before looking further, I hope someone knows where we are misled.

Upvotes: 1

Related Questions