JBE

Reputation: 12597

Performance test independent of the number of iterations

Trying to answer this question: What is the difference between instanceof and Class.isAssignableFrom(...)?

I made a performance test:

class A{}
class B extends A{}

A b = new B();

void execute(){
  boolean test = A.class.isAssignableFrom(b.getClass());
  // boolean test = A.class.isInstance(b);
  // boolean test = b instanceof A;
}

@Test
public void testPerf() {
  // Warmup the code
  for (int i = 0; i < 100; ++i)
    execute();

  // Time it
  int count = 100000;
  final long start = System.nanoTime();
  for(int i=0; i<count; i++){
     execute();
  }
  final long elapsed = System.nanoTime() - start;
  System.out.println(count + " iterations took " + TimeUnit.NANOSECONDS.toMillis(elapsed) + "ms.");
}

Which gave me:

But playing with the number of iterations, I can see the performance is constant. For Integer.MAX_VALUE:

Thinking it was a compiler optimization (I ran this test with JUnit), I changed it to this:

@Test
public void testPerf() {
    boolean test = false;

    // Warmup the code
    for (int i = 0; i < 100; ++i)
        test |= b instanceof A;

    // Time it
    int count = Integer.MAX_VALUE;
    final long start = System.nanoTime();
    for(int i=0; i<count; i++){
        test |= b instanceof A;
    }
    final long elapsed = System.nanoTime() - start;
    System.out.println(count+" iterations took " + TimeUnit.NANOSECONDS.toMillis(elapsed) + "ms. AVG= " + TimeUnit.NANOSECONDS.toMillis(elapsed/count));

    System.out.println(test);
}

But the performance is still "independent" of the number of iterations. Could someone explain this behavior?

Upvotes: 3

Views: 385

Answers (3)

Marko Topolnik

Reputation: 200168

  1. A hundred iterations is not nearly enough for warmup. The default compile threshold is 10000 iterations (a hundred times more), so best go at least a bit over that threshold.
  2. Once the compilation has been triggered, the world is not stopped; the compilation takes place in the background. That means that its effect will start being observable only after a slight delay.
  3. There is ample space for optimization of your test in such a way that the entire loop is collapsed into its final result. That would explain the constant numbers.

Anyway, I always do the benchmarks by having an outer method call the inner method something like 10 times. The inner method does a big number of iterations, say 10,000 or more, as needed to make its runtime rise into at least tens of milliseconds. I don't even bother with nanoTime since if microsecond precision is important to you, it is just a sign of measuring too short a time interval.

When you do it like this, you are making it easy for the JIT to execute a compiled version of the inner method after it was substituted for the interpreted version. Another benefit is that you get assurance that the times of the inner method are stabilizing.
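A minimal sketch of the outer/inner pattern described above (the class name, the tested object, and the iteration counts are illustrative choices, not from the answer):

```java
import java.util.concurrent.TimeUnit;

public class Bench {
    // Hypothetical object to test; stands in for the question's `b`.
    static final Object b = Integer.valueOf(1);

    // Inner method: enough iterations that a single call takes at least
    // tens of milliseconds, so millisecond timing is meaningful.
    static boolean inner(int iterations) {
        boolean test = false;
        for (int i = 0; i < iterations; i++) {
            test |= b instanceof Number;
        }
        return test;
    }

    public static void main(String[] args) {
        // Outer method: call the inner method ~10 times. Later runs execute
        // the JIT-compiled version, and you can watch the timings stabilize.
        for (int run = 0; run < 10; run++) {
            long start = System.nanoTime();
            boolean result = inner(20_000_000);
            long elapsedMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            System.out.println("run " + run + ": " + elapsedMs + " ms (result=" + result + ")");
        }
    }
}
```

Returning and printing `result` gives the loop an observable effect, which makes it harder for the JIT to collapse the whole loop into its final value (point 3 above).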

Upvotes: 5

Peter Lawrey

Reputation: 533530

The JIT compiler can eliminate loops which don't do anything. This can be triggered after 10,000 iterations.

What I suspect you are timing is how long it takes for the JIT to detect that the loop doesn't do anything and remove it. This will be a little longer than it takes to do 10,000 iterations.
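One common way to keep a loop from being eliminated as dead code is to accumulate a value inside it and use that value afterwards. A sketch (not from the answer, and no hard guarantee against every optimization the JIT may apply):

```java
public class KeepLoopAlive {
    public static void main(String[] args) {
        long sink = 0;
        long start = System.nanoTime();
        for (int i = 0; i < 10_000_000; i++) {
            // Accumulate into a variable that is read after the loop,
            // so the JIT cannot prove the loop body has no effect.
            sink += i;
        }
        long elapsed = System.nanoTime() - start;
        // Printing the accumulated value keeps it (and the loop) observable.
        System.out.println("sink=" + sink + ", elapsed ns=" + elapsed);
    }
}
```

This is the same idea as the `test |= b instanceof A;` accumulator in the question, except that here the result is also printed, so the work cannot be discarded as unused.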

Upvotes: 1

Valentin Rocher

Reputation: 11669

If you want to make a real benchmark of a simple function, you should use a micro-benchmarking tool, like Caliper. It will be much simpler than trying to make your own benchmark.

Upvotes: 3
