Reputation: 121
I'm playing with some piece of code that measures the time needed to execute some Java code, to get a feeling for the efficiency or inefficiency of some of Java's functionality. Doing so, I'm now stuck on a really strange effect that I just can't explain to myself. Maybe one of you can help me understand it.
import java.util.LinkedList;
import java.util.List;

public class PerformanceCheck {

    public static void main(String[] args) {
        List<PerformanceCheck> removeList = new LinkedList<PerformanceCheck>();
        int maxTimes = 1000000000;

        for (int i = 0; i < 10; i++) {
            long time = System.currentTimeMillis();

            for (int times = 0; times < maxTimes; times++) {
                // PERFORMANCE CHECK BLOCK START
                if (removeList.size() > 0) {
                    testFunc(3);
                }
                // PERFORMANCE CHECK BLOCK END
            }

            long timeNow = System.currentTimeMillis();
            System.out.println("time: " + (timeNow - time));
        }
    }

    private static boolean testFunc(int test) {
        return 5 > test;
    }
}
Starting this results in a relatively long computation time (remember removeList is empty, so testFunc is not even called):
time: 2328
time: 2223
...
Meanwhile, replacing any part of the combination of removeList.size() > 0 and testFunc(3) with anything else gives better results. For example:
...
if (removeList.size() == 0) {
    testFunc(3);
}
...
Results in (testFunc is called every single time):
time: 8
time: 7
time: 0
time: 0
Even calling both functions independently of each other results in the lower computation time:
...
if (removeList.size() == 0);
testFunc(3);
...
Result:
time: 6
time: 5
time: 0
time: 0
...
Only this particular combination in my initial example takes so long. It baffles me, and I'd really like to understand it. What's so special about it?
Thanks.
Addition:
Changing testFunc() in the first example
if (removeList.size() > 0) {
    testFunc(times);
}
to something else, like
private static int testFunc2(int test) {
    return 5 * test;
}
will result in it being fast again.
Upvotes: 12
Views: 1183
Reputation: 533870
The times are unrealistically fast per iteration. This means the JIT has detected that your code doesn't do anything and has eliminated it. Subtle changes can confuse the JIT so that it can't determine that the code does nothing, and then the code takes some time.
If you change the test to do something marginally useful, the difference will disappear.
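For example (a hypothetical sketch, not the original poster's exact code), feeding the loop's result into a value that is printed afterwards gives the test something marginally useful to do, so the JIT cannot prove the body is dead:

```java
import java.util.LinkedList;
import java.util.List;

public class PerformanceCheck {

    private static boolean testFunc(int test) {
        return 5 > test;
    }

    public static void main(String[] args) {
        List<PerformanceCheck> removeList = new LinkedList<PerformanceCheck>();
        int maxTimes = 100000000;
        long sink = 0; // accumulates a result the JIT must keep

        for (int i = 0; i < 10; i++) {
            long time = System.currentTimeMillis();
            for (int times = 0; times < maxTimes; times++) {
                if (removeList.size() == 0 && testFunc(times)) {
                    sink += times;
                }
            }
            System.out.println("time: " + (System.currentTimeMillis() - time));
        }

        // Printing the sink makes the loop's work observable,
        // so it cannot be eliminated as dead code.
        System.out.println(sink);
    }
}
```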
Upvotes: 1
Reputation: 2812
While not directly related to this question, this is how you would correctly micro-benchmark the code using Caliper. Below is a modified version of your code so that it will run with Caliper. The inner loops had to be modified somewhat so that the VM will not optimize them out. It is surprisingly smart at realizing that nothing is happening.
There are also a lot of nuances when benchmarking Java code. I wrote about some of the issues I ran into at Java Matrix Benchmark, such as how past history can affect current results. You will avoid many of those issues by using Caliper.
Benchmarking issues with Java Matrix Benchmark
import com.google.caliper.Runner;
import com.google.caliper.SimpleBenchmark;
import java.util.LinkedList;
import java.util.List;

public class PerformanceCheck extends SimpleBenchmark {

    public int timeFirstCase(int reps) {
        List<PerformanceCheck> removeList = new LinkedList<PerformanceCheck>();
        removeList.add(new PerformanceCheck());

        int ret = 0;
        for (int i = 0; i < reps; i++) {
            if (removeList.size() > 0) {
                if (testFunc(i))
                    ret++;
            }
        }
        return ret;
    }

    public int timeSecondCase(int reps) {
        List<PerformanceCheck> removeList = new LinkedList<PerformanceCheck>();
        removeList.add(new PerformanceCheck());

        int ret = 0;
        for (int i = 0; i < reps; i++) {
            if (removeList.size() == 0) {
                if (testFunc(i))
                    ret++;
            }
        }
        return ret;
    }

    private static boolean testFunc(int test) {
        return 5 > test;
    }

    public static void main(String[] args) {
        Runner.main(PerformanceCheck.class, args);
    }
}
OUTPUT:
0% Scenario{vm=java, trial=0, benchmark=FirstCase} 0.60 ns; σ=0.00 ns @ 3 trials
50% Scenario{vm=java, trial=0, benchmark=SecondCase} 1.92 ns; σ=0.22 ns @ 10 trials
benchmark ns linear runtime
FirstCase 0.598 =========
SecondCase 1.925 ==============================
vm: java
trial: 0
Upvotes: 2
Reputation: 30042
When the runtime compiler can figure out that testFunc evaluates to a constant, I believe it does not evaluate the loop, which explains the speedup.
When the condition is removeList.size() == 0, the call testFunc(3) gets evaluated to a constant. When the condition is removeList.size() != 0, the inner code never gets evaluated, so it can't be sped up. You can modify your code as follows:
for (int times = 0; times < maxTimes; times++) {
    testFunc(); // Removing this call makes the code slow again!
    if (removeList.size() != 0) {
        testFunc();
    }
}

private static boolean testFunc() {
    return testFunc(3);
}
When testFunc() is not initially called, the runtime compiler does not realize that testFunc() evaluates to a constant, so it cannot optimize the loop.
Certain functions, like
private static int testFunc2(int test) {
    return 5 * test;
}
the compiler likely tries to pre-optimize (before execution), but apparently not in the case where a parameter is passed in as an integer and evaluated in a conditional.
Your benchmark returns times like
time: 107
time: 106
time: 0
time: 0
...
suggesting that it takes two iterations of the outer loop for the runtime compiler to finish optimizing. Running with the -server
flag would probably return all 0s in the benchmark.
Upvotes: 1
Reputation: 1281
Well, I am glad I don't have to deal with Java performance optimizations. I tried it myself with the Java JDK 7 64-bit. The results are arbitrary ;). It makes no difference which lists I use or whether I cache the result of size() before entering the loop. Also, entirely wiping out the test function makes almost no difference (so it can't be a branch-prediction hit either). Optimization flags improve performance, but are just as arbitrary.
The only logical conclusion here is that the JIT compiler is sometimes able to optimize away the statement (which is not that hard to believe), but it seems rather arbitrary. This is one of the many reasons why I prefer languages like C++, where the behaviour is at least deterministic, even if it is sometimes arbitrary.
BTW, in the latest Eclipse, as it always was on Windows, running this code via the IDE's "Run" (no debug) is 10 times slower than running it from the console, so much for that...
Upvotes: 1
Reputation: 66891
That is really surprising. The generated bytecode is identical except for the conditional, which is ifle vs. ifne.
The results are much more sensible if you turn off the JIT with -Xint. The second version is then 2x slower. So it has to do with the JIT optimization.
I assume that it can optimize out the check in the second case but not the first (for whatever reason). Even though that means it does the work of the function, missing that conditional makes things much faster. It avoids pipeline stalls and all that.
Upvotes: 3
Reputation: 15729
These benchmarks are tough since compilers are so darned smart. One guess: Since the result of testFunc() is ignored, the compiler might be completely optimizing it out. Add a counter, something like
if (testFunc(3))
    counter++;
And, just for thoroughness, do a System.out.println(counter)
at the end.
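Put together, the suggestion might look like this (a sketch based on the fast variant of the original loop; the class name and the smaller maxTimes are my own choices):

```java
import java.util.LinkedList;
import java.util.List;

public class PerformanceCheckCounter {

    private static boolean testFunc(int test) {
        return 5 > test;
    }

    public static void main(String[] args) {
        List<Object> removeList = new LinkedList<Object>();
        int maxTimes = 100000000;
        long counter = 0;

        for (int i = 0; i < 10; i++) {
            long time = System.currentTimeMillis();
            for (int times = 0; times < maxTimes; times++) {
                if (removeList.size() == 0) {
                    if (testFunc(3))
                        counter++; // consume the result so it cannot be optimized out
                }
            }
            System.out.println("time: " + (System.currentTimeMillis() - time));
        }

        // Printing the counter keeps the whole computation observable.
        System.out.println(counter);
    }
}
```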
Upvotes: 0