Reputation: 683
I stumbled upon a strange timing difference between two equivalent ways of writing a conditional increment.
if(seed != state) ++i;
this notation takes 2.25 ms per 1048576 iterations,
i += (seed != state);
and this one takes 2.80 ms per 1048576 iterations.
Shouldn't the second notation be a bit faster than the first? Note that seed == state happens very rarely (about 1 in 2^32 - 1 times).
Thanks for your answers.
Edit: I tested the same thing with the gcc C compiler and there the first was slightly faster than the second, but the second was the same speed as with the C++ compiler.
Upvotes: 1
Views: 92
Reputation: 57688
With the if statement, the compiler can translate the increment into a conditionally executed instruction (at least on processors that support conditional execution).
The second example will always perform an addition of 1 or zero.
This is micro-optimization and really depends on the processor and its support system (caches, branch prediction, etc.). For example, the 2nd example may be faster because there is no decision jump. The first example may be faster on processors that have branch prediction. The difference between the two may be negligible on processors that can fit the code fragment into the instruction cache (and not need to fetch other instructions).
I'm surprised that the code execution is in milliseconds. Most modern processors should be executing in nanoseconds for those examples.
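For scale, the timings in the question are totals over 1048576 iterations, so the cost per iteration works out to a few nanoseconds. A small sketch of that arithmetic (the names kIterations and ns_per_iteration are made up for illustration):

```cpp
// Per-iteration cost implied by the totals quoted in the question.
constexpr double kIterations = 1048576.0;  // 2^20 iterations per measurement

// Convert a total time in milliseconds into nanoseconds per iteration.
constexpr double ns_per_iteration(double total_ms) {
    return total_ms * 1.0e6 / kIterations;  // 1 ms = 1e6 ns
}

// ns_per_iteration(2.25) is roughly 2.15 ns (the if version)
// ns_per_iteration(2.80) is roughly 2.67 ns (the += version)
```

So the gap between the two notations is on the order of half a nanosecond per iteration.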
Upvotes: 1
Reputation: 1062
The if version incurs a conditional branch instruction. The other just promotes a bool to an int and adds it.
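To make the two forms concrete, here is a minimal sketch of both as standalone functions. The loop shape, the seeds array, and the function names are assumptions for illustration; the question does not show its actual loop.

```cpp
#include <cstddef>
#include <cstdint>

// Variant 1: conditional increment. The compiler may emit a branch here,
// or it may choose a branch-free sequence; that is up to the optimizer.
std::uint32_t count_with_if(const std::uint32_t* seeds, std::size_t n,
                            std::uint32_t state) {
    std::uint32_t i = 0;
    for (std::size_t k = 0; k < n; ++k) {
        if (seeds[k] != state) ++i;
    }
    return i;
}

// Variant 2: the bool result of the comparison is converted to an
// integer (0 or 1) and added unconditionally.
std::uint32_t count_with_add(const std::uint32_t* seeds, std::size_t n,
                             std::uint32_t state) {
    std::uint32_t i = 0;
    for (std::size_t k = 0; k < n; ++k) {
        i += (seeds[k] != state);
    }
    return i;
}
```

Both functions return the same count for the same input; only the generated machine code may differ.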
Edit:
Both forms are well defined: the standard requires the conversion (int)true to result in 1, not merely "not zero" (and in C, the != operator itself yields an int equal to 0 or 1). So the choice between them is about speed and readability rather than portability. In practice, I've never seen a bool to int conversion that didn't use 1 for true.
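For reference, the C++ standard does specify that converting a bool to an integer type yields 0 for false and 1 for true, and this can be checked at compile time. A minimal sketch, assuming a C++11 or later compiler (as_int is a hypothetical helper name):

```cpp
// These fail to compile if the conversions were anything other than 0 and 1.
static_assert(static_cast<int>(true) == 1, "true converts to 1");
static_assert(static_cast<int>(false) == 0, "false converts to 0");

// Hypothetical helper: the same implicit conversion applied to any bool.
constexpr int as_int(bool b) { return b; }

// The result of a comparison is a bool, so it converts the same way.
static_assert(as_int(3 != 4) == 1, "a true comparison adds exactly 1");
static_assert(as_int(4 != 4) == 0, "a false comparison adds exactly 0");
```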
Upvotes: 3
Reputation: 2763
The first one could conditionally skip the add as a result of branch prediction. The second one is going to add on every iteration because there is no way to skip the add: the result of the comparison will always be added to i.
I would assume that you will find that the first one is not performing the add on every iteration, but only occasionally when branch prediction fails.
Upvotes: 0
Reputation: 36346
You never know what your compiler does to optimize your code. Branch prediction would actually make the first one faster. The second one depends on the comparison actually being carried out and the result being added to i as 1 if true, which (depending on your CPU, but likely) will introduce a "dummy" register loaded with 1.
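One way to find out is to time both variants under identical conditions. The following is a minimal sketch, not the asker's actual benchmark: next_seed (an xorshift32 step), the fixed state value, and the names run, run_if, and run_add are all assumptions made for illustration.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical stand-in for whatever updates seed in the real code (xorshift32).
inline std::uint32_t next_seed(std::uint32_t x) {
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    return x;
}

struct Result {
    std::uint32_t count;  // final value of i
    double ms;            // elapsed wall-clock time in milliseconds
};

// Runs the loop with the given increment step and times it.
template <typename Step>
Result run(std::uint32_t iterations, Step step) {
    std::uint32_t seed = 2463534242u;   // arbitrary nonzero start value
    const std::uint32_t state = 12345u; // arbitrary value to compare against
    std::uint32_t i = 0;
    const auto t0 = std::chrono::steady_clock::now();
    for (std::uint32_t k = 0; k < iterations; ++k) {
        seed = next_seed(seed);
        step(i, seed, state);
    }
    const auto t1 = std::chrono::steady_clock::now();
    return {i, std::chrono::duration<double, std::milli>(t1 - t0).count()};
}

// Variant with the if statement.
inline Result run_if(std::uint32_t n) {
    return run(n, [](std::uint32_t& i, std::uint32_t seed, std::uint32_t state) {
        if (seed != state) ++i;
    });
}

// Variant that adds the comparison result.
inline Result run_add(std::uint32_t n) {
    return run(n, [](std::uint32_t& i, std::uint32_t seed, std::uint32_t state) {
        i += (seed != state);
    });
}
```

Calling run_if(1048576) and run_add(1048576) several times and comparing the ms fields gives a rough picture. The numbers are only meaningful with optimization enabled, and looking at the generated assembly (for example with g++ -O2 -S) shows whether the compiler emitted a branch at all, since it may well compile both variants to the same instructions.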
Upvotes: 2