Reputation: 33817
You have two arrays and a function that counts differences between them:
for( i = 0; i < len; ++i ) {
int value1 = vector1[i];
int value2 = vector2[i];
if( value1 != value2 ) ++num_differences;
}
As branching downgrades performance, it can be optimized to:
for( i = 0; i < len; ++i ) {
num_differences += !!(vector1[i] != vector2[i])
}
// !!(..) is to be sure that the result is boolean 0 or 1
so there is no if
clause. But does it practically make sense? With GCC (and other compilers) being so smart, does it make sense to play with such optimizations?
Upvotes: 2
Views: 132
Reputation: 71506
I dont think you are going to do much better, your second example is hard to read/understand for the average programmer which means two things one hard to understand and maintain, two you may be creeping into dark, less tested/supported, corners of the compiler. Drive down the road between the lines, dont wander about on the shoulder or in the wrong lane.
Go with this
for( i = 0; i < len; ++i ) {
int value1 = vector1[i];
int value2 = vector2[i];
if( value1 != value2 ) ++num_differences;
}
or this
for( i = 0; i < len; ++i ) {
if( vector1[i] != vector2[i] ) ++num_differences;
}
if it really is bothering you and you have properly concluded this is your performance bottleneck then time the difference between them. From the disassembly shown, and the nature of this platform, it is very difficult to properly time such things and draw the right conclusions. Too many caches, and other factors that cloud over the results, leading to false conclusions, etc. and no two x86 implementations have the same performance so if you happen to tune for your computer you are likely detuning it for another model of x86 or even the same make on a different motherboard with different I/O characteristics.
Upvotes: 1
Reputation: 258548
Unless len
is several millions large or you're comparing a lot of arrays, then no. The second version is less readable (not so much to an experienced programmer), so I'd prefer the first variant, unless this is the bottleneck (doubtful).
The following codes are generated, with optimizations:
for( i = 0; i < 4; ++i ) {
int value1 = vector1[i];
int value2 = vector2[i];
if( value1 != value2 ) ++num_differences;
00401000 mov ecx,dword ptr [vector1 (40301Ch)]
00401006 xor eax,eax
00401008 cmp ecx,dword ptr [vector2 (40302Ch)]
0040100E je wmain+15h (401015h)
00401010 mov eax,1
00401015 mov edx,dword ptr [vector1+4 (403020h)]
0040101B cmp edx,dword ptr [vector2+4 (403030h)]
00401021 je wmain+26h (401026h)
00401023 add eax,1
00401026 mov ecx,dword ptr [vector1+8 (403024h)]
0040102C cmp ecx,dword ptr [vector2+8 (403034h)]
00401032 je wmain+37h (401037h)
00401034 add eax,1
00401037 mov edx,dword ptr [vector1+0Ch (403028h)]
0040103D cmp edx,dword ptr [vector2+0Ch (403038h)]
00401043 je wmain+48h (401048h)
00401045 add eax,1
}
for( i = 0; i < 4; ++i ) {
num_differences += !!(vector1[i] != vector2[i]);
00401064 mov edx,dword ptr [vector1+0Ch (403028h)]
0040106A xor eax,eax
0040106C cmp edx,dword ptr [vector2+0Ch (403038h)]
00401072 mov edx,dword ptr [vector1+8 (403024h)]
00401078 setne al
0040107B xor ecx,ecx
0040107D cmp edx,dword ptr [vector2+8 (403034h)]
00401083 mov edx,dword ptr [vector1+4 (403020h)]
00401089 setne cl
0040108C add eax,ecx
0040108E xor ecx,ecx
00401090 cmp edx,dword ptr [vector2+4 (403030h)]
00401096 mov edx,dword ptr [vector1 (40301Ch)]
0040109C setne cl
0040109F add eax,ecx
004010A1 xor ecx,ecx
004010A3 cmp edx,dword ptr [vector2 (40302Ch)]
004010A9 setne cl
004010AC add eax,ecx
}
So, actually, the second version is slightly slower (theoretically). 19 instructions for the second vs. 17 for the first.
Upvotes: 2
Reputation: 32576
The short answer is: "Trust your Compiler".
In general you're not going to see much benefit from optimisations like this unless you're working with really huge datasets. Even then you really need to benchmark the code to see if there is any improvement.
Upvotes: 4
Reputation: 104698
You should compare the code the compiler generates. It may be equivalent.
The compiler's very smart, but a good engineer can certainly improve a program's performance.
Upvotes: 1