Reputation:
So I have two options; both functions have identical types:
(Entry->d_type == DT_DIR ? rmdirr : remove)(CurrentEntryPath);
Or
if (Entry->d_type == DT_DIR) {
rmdirr(CurrentEntryPath);
} else {
remove(CurrentEntryPath);
}
I have confirmed that the ternary is 100% safe, because both functions have compatible pointer types. Which one is faster (even if less readable)?
Upvotes: 0
Views: 156
Reputation: 123578
Rule #0 - Do not think in terms of raw speed; instead, think in terms of "which would I rather fix 8 months from now when someone reports a bug".
Rule #1 - Measure, don't guess, and don't ask people who don't have access to your system to guess. Code up both versions on the target system and profile them - examine the generated machine code, and run each version against a large enough test set to generate usable statistics and analyze the results. Consider how it is used - is it called thousands of times in a tight loop, or is it called once over the lifetime of the program? Each function involves updating the file system, which will take many orders of magnitude more time to execute than deciding which one to call regardless of which method you use.
Rule #2 - It doesn't matter how fast your code is if it gives you the wrong answer, or does the wrong thing, or exposes your credit card information to the world, or blows up if someone in the next room sneezes, or if nobody (including yourself) can fix or update it. Code for correctness first, then for readability and maintainability, then for safety and reliability, and then for speed. Most of your significant speed gains come from using the right algorithm and data structure, not your choice of flow control structure.
Rule #3 - Do not use the ternary operator in place of an if-else structure just for flow control; that's not its job. While the first version works, it's a bit eye-stabby and hard to read at a glance, and when you pick it back up six months from now you're going to ask yourself why you did that. And I can practically guarantee it won't be measurably faster or slower than the other method.
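To illustrate the distinction in Rule #3, here is a minimal sketch. The helper names `entry_kind` and `remove_entry` are hypothetical and not from the question, and the actions are print-only stubs so nothing on disk is touched. The ternary is used where it fits (picking a value), while if-else decides which action runs.

```c
#include <stdio.h>

/* Hypothetical helper: the ternary selects a value (a label), nothing more. */
const char *entry_kind(int is_dir)
{
    return is_dir ? "directory" : "file";
}

/* Hypothetical helper: if-else decides which action runs.
   The actions are print-only stubs; no file is actually removed.
   Returns 1 when the directory path was taken, 0 otherwise. */
int remove_entry(const char *path, int is_dir)
{
    if (is_dir) {
        printf("would remove directory %s\n", path);
        return 1;
    } else {
        printf("would remove file %s\n", path);
        return 0;
    }
}
```

In real code the two `printf` calls would be replaced by the actual removal functions.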
I'm not saying that speed doesn't matter - I'm saying that speed is only one thing that needs to be considered, and unless you're working in specific domains, it's not the most important thing.
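As a rough sketch of what "measure, don't guess" from Rule #1 could look like, the harness below times both dispatch styles. Everything in it is an assumption for illustration: `stub_dir` and `stub_file` are cheap stand-ins for the real removal functions, and `time_dispatch` is a hypothetical helper that uses the standard `clock()` function, which reports CPU time at coarse resolution. Treat the numbers it produces as indicative only.

```c
#include <time.h>

/* Cheap, side-effect-free stand-ins for the real removal functions. */
static int stub_dir(const char *p)  { return p[0] + 1; }
static int stub_file(const char *p) { return p[0] + 2; }

/* Dispatch through the conditional operator (function-pointer selection). */
int dispatch_ternary(int is_dir, const char *p)
{
    return (is_dir ? stub_dir : stub_file)(p);
}

/* Dispatch through a plain if-else. */
int dispatch_if(int is_dir, const char *p)
{
    if (is_dir)
        return stub_dir(p);
    else
        return stub_file(p);
}

/* Calls f `iters` times, alternating the flag, and returns the CPU
   seconds spent. The accumulated sum is stored to a volatile sink so
   the compiler cannot discard the loop as dead code. */
double time_dispatch(int (*f)(int, const char *), long iters)
{
    static volatile long sink;
    long sum = 0;
    clock_t start = clock();
    for (long i = 0; i < iters; i++)
        sum += f((int)(i & 1), "x");
    clock_t end = clock();
    sink = sum;
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

Calling `time_dispatch(dispatch_ternary, n)` and `time_dispatch(dispatch_if, n)` with a large `n`, built with the same flags as the real program, gives two figures to compare. Because the stubs do no file system work, any difference seen here will be dwarfed once the real calls are in place.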
Upvotes: 5
Reputation: 60143
It's very conceivable that an optimizing compiler will generate the same code for the two cases.
Curiously, gcc and clang don't do that in this case; instead they generate code that literally uses function pointers for the ?: case and direct jumps for the second case.
Example:
#include <dirent.h>
#include <stdio.h>
#include <unistd.h>
int rmitem0(struct dirent const*Entry)
{
return (Entry->d_type == DT_DIR ? rmdir : remove)(Entry->d_name);
}
int rmitem1(struct dirent const*Entry)
{
if (Entry->d_type == DT_DIR)
return rmdir(Entry->d_name);
else return remove(Entry->d_name);
}
x86_64 clang:
0000000000000000 <rmitem0>:
0: 80 7f 12 04 cmp BYTE PTR [rdi+0x12],0x4
4: b8 00 00 00 00 mov eax,0x0 5: R_X86_64_32 rmdir
9: b9 00 00 00 00 mov ecx,0x0 a: R_X86_64_32 remove
e: 48 0f 44 c8 cmove rcx,rax
12: 48 83 c7 13 add rdi,0x13
16: ff e1 jmp rcx
0000000000000018 <rmitem1>:
18: 80 7f 12 04 cmp BYTE PTR [rdi+0x12],0x4
1c: 48 8d 7f 13 lea rdi,[rdi+0x13]
20: 0f 85 00 00 00 00 jne 26 <rmitem1+0xe> 22: R_X86_64_PLT32 remove-0x4
26: e9 00 00 00 00 jmp 2b <rmitem1+0x13> 27: R_X86_64_PLT32 rmdir-0x4
x86_64 gcc:
0000000000000000 <rmitem0>:
0: 80 7f 12 04 cmp BYTE PTR [rdi+0x12],0x4
4: 74 09 je f <rmitem0+0xf>
6: 48 8b 05 00 00 00 00 mov rax,QWORD PTR [rip+0x0] # d <rmitem0+0xd> 9: R_X86_64_REX_GOTPCRELX remove-0x4
d: eb 07 jmp 16 <rmitem0+0x16>
f: 48 8b 05 00 00 00 00 mov rax,QWORD PTR [rip+0x0] # 16 <rmitem0+0x16> 12: R_X86_64_REX_GOTPCRELX rmdir-0x4
16: 48 83 c7 13 add rdi,0x13
1a: ff e0 jmp rax
000000000000001c <rmitem1>:
1c: 4c 8d 47 13 lea r8,[rdi+0x13]
20: 80 7f 12 04 cmp BYTE PTR [rdi+0x12],0x4
24: 4c 89 c7 mov rdi,r8
27: 75 05 jne 2e <rmitem1+0x12>
29: e9 00 00 00 00 jmp 2e <rmitem1+0x12> 2a: R_X86_64_PLT32 rmdir-0x4
2e: e9 00 00 00 00 jmp 33 <rmitem1+0x17> 2f: R_X86_64_PLT32 remove-0x4
These two strategies should therefore have slightly different performance characteristics here, but in any case you're missing the forest for a tiny tree.
I've measured the duration of an rmdir call to be about 14µs on Linux.
The conditionals above should take a fraction of a nanosecond, a few nanoseconds at most: that's thousands to tens of thousands of times faster than your bottleneck.
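For anyone who wants to reproduce a figure like this on their own system, here is one possible sketch. It is not the code used for the measurement above. The function name `mean_rmdir_ns` and the scratch directory naming are assumptions, and the result will vary with the file system, kernel, and cache state.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Creates and removes `rounds` empty scratch directories under `base`
   and returns the mean wall-clock nanoseconds spent in rmdir() alone.
   Returns -1.0 if any step fails. */
double mean_rmdir_ns(const char *base, int rounds)
{
    char path[4096];
    double total_ns = 0.0;

    if (rounds <= 0)
        return -1.0;

    for (int i = 0; i < rounds; i++) {
        /* Hypothetical scratch name; the pid keeps parallel runs apart. */
        int n = snprintf(path, sizeof path, "%s/rmdir_probe_%ld_%d",
                         base, (long)getpid(), i);
        if (n < 0 || (size_t)n >= sizeof path)
            return -1.0;
        if (mkdir(path, 0700) != 0)
            return -1.0;

        struct timespec t0, t1;
        clock_gettime(CLOCK_MONOTONIC, &t0);
        int rc = rmdir(path);           /* the only call being timed */
        clock_gettime(CLOCK_MONOTONIC, &t1);
        if (rc != 0)
            return -1.0;

        total_ns += (double)(t1.tv_sec - t0.tv_sec) * 1e9
                  + (double)(t1.tv_nsec - t0.tv_nsec);
    }
    return total_ns / rounds;
}
```

A call such as `mean_rmdir_ns("/tmp", 1000)` gives a mean to compare against the nanosecond-scale cost of the conditional. Removing an empty, freshly created directory is close to the cheapest case, so real workloads are likely to be slower.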
Upvotes: 2
Reputation: 68013
It is very difficult to judge which is actually more efficient. The if-else produces fewer instructions, but it contains a branch instruction that forces a pipeline flush when the branch is mispredicted.
#include <stdlib.h> /* rand */
#define SOMEVALUE 5
int __attribute__((noinline)) foo(int x)
{
return rand();
}
int __attribute__((noinline)) boo(int x)
{
return rand();
}
int aaa(int x)
{
int result;
if(x == 5)
result = foo(x);
else
result = boo(x);
return result;
}
int bbb(int x)
{
return (x == 5 ? foo : boo)(x);
}
int (*z[2])(int) = {foo, boo};
int ccc(int x)
{
return z[!!(x == 5)](x); /* note: x == 5 selects z[1], i.e. boo, the reverse of aaa and bbb */
}
and the resulting code:
foo:
jmp rand
boo:
jmp rand
aaa:
cmp edi, 5
je .L6
jmp boo
.L6:
jmp foo
bbb:
cmp edi, 5
mov eax, OFFSET FLAT:foo
mov edx, OFFSET FLAT:boo
cmovne rax, rdx
jmp rax
ccc:
xor eax, eax
cmp edi, 5
sete al
jmp [QWORD PTR z[0+rax*8]]
z:
.quad foo
.quad boo
In my opinion, if you attempt this kind of micro-optimization in less trivial code, you need to look at the generated code and decide which is more efficient.
Upvotes: 4