Reputation: 31
If I had a loop that checks for a specific value in an array and, for some reason, must iterate over all elements and cannot break midway.
Which of the following would be more efficient: blindly setting a flag on each match, or checking if the flag is false before setting it.
bool happened = false;
while (...) {
if (...) {
happened = true;
}
}
vs
bool happened = false;
while (...) {
if (...) {
if (!happened) happened = true;
}
}
As far as I can tell, both are more or less equivalent on the assumption that memory reads are as fast as memory writes (ignoring the extra instruction in the second example). Am I correct in my conclusions?
Upvotes: 2
Views: 134
Reputation:
Most answers so far seem to be "it depends", but I think it's pretty obvious:
If you conditionally set a value to something if it isn't that something already, it is logically identical to unconditionally setting the value. If you're lucky, the compiler will notice and treat both identically, but if it doesn't, the unconditional version wins every time.
1: It uses fewer instructions
2: The extra instructions are conditional, hurting branch prediction
If you're going with
if (cond) varx = vary;
The compiler uses one conditional branch (possibly a conditional move instead of a branch, if it is supported in the hardware)
If you're going with
if (cond && varx != vary) varx = vary;
The compiler will either simplify to the first case, or use two conditional jumps (or one jump and a conditional move).
Upvotes: 1
Reputation: 4387
The compiler will make the decision for you, if you use any meaningful optimization. Write whatever is cleanest and makes sense. To me that would be the first one, since it is less code and doesn't introduce more code paths. For fun, I did some tests in Clang 3.4 -O3:
bool happened = false;
extern volatile int dontOptMe1, dontOptMe2, dontOptMe3;
while (dontOptMe1) {
if (dontOptMe2) {
happened = true;
}
}
dontOptMe3 = happened;
vs
bool happened = false;
extern volatile int dontOptMe1, dontOptMe2, dontOptMe3;
while (dontOptMe1) {
if (dontOptMe2) {
if(!happened) happened = true;
}
}
dontOptMe3 = happened;
Resulted in the following in pseudo ASM:
MOV happened, 0
BRA LOOP_END
LOOP_START:
SELECTEQ dontOptMe2, 0, happened, happened, 1
LOOP_END:
BCZC dontOptMe1, LOOP_START
EXIT:
STORE dontOptMe3, happened
vs
MOV happened, 0
BCZS dontOptMe1, EXIT
LOOP:
SELECTNE dontOptMe2, 0, R2, 1, 0
SELECTEQ happened, 0, R3, 1, 0
AND R3, R2, R3
SELECTNE R3, 0, happened, 1, 0
BCZC dontOptMe1, LOOP
EXIT:
STORE dontOptMe3, happened
The first is much more desirable. This is also a good example of how restrictive volatile types are. I was surprised the compiler couldn't transform the second into the first.
Note: SELECTXX means, if Arg1 minus Arg2 sets condition code XX, set Arg3 to Arg4, otherwise set Arg3 to Arg5. So: SELECTNE dontOptMe2, 0, R2, 1, 0
is the equivalent of: R2 = (dontOptMe2 == 0) ? 1 : 0;
in C
Upvotes: 4
Reputation: 246
I have run it empirically on the following code:
bool happened = false;
for (int i = 0; i < 1000000000; i++) {
if (i % 2) {
// uncomment the one you want to use
// happened = true;
// if (!happened) happened = true;
}
}
I ran it 10 times on each, getting a mean of 2.304s with standard deviation 0.013s on the first and a mean of 2.399s with standard deviation 0.007s on the second. A two-sample t-test shows that happened = true;
is faster than if (!happened) happened = true;
and the difference is statistically significant with with p = 7e-12.
Upvotes: 0
Reputation: 15121
Generally speaking, first version is more pipeline friendly, because its instruction stream is less disturbed by jump, therefore is more efficient. But it is depends on specific architecture features and compiler optimizations.
I believe the performance difference of these two versions is unnoticeable in actual situations.
Upvotes: 3
Reputation: 91119
Yes, you are quite correct. They are more or less the same.
To be exact, the first form might be preferrable if the "happened" happens quite often, while the second one might want to be favourable if the "happend" is quite rare. But even then, I don't think the second one is better in any way.
In the end, it is probably a micro-optimization where it is better to go for readability instead of performance.
Upvotes: 0
Reputation: 34423
Only profiling will tell for sure, but most likely the variable is stored in the register anyway, not in memory (unless you have many other local variables in the loop) and there will be no measurable difference at all.
Upvotes: 0