Reputation: 43527
I have some code that I'd like to have use the fast built-in CPU instruction popcnt (this happens when __builtin_popcountll is compiled with the proper flags, such as g++ -mpopcnt or clang++ -march=corei7), but that can also fall back to other code when cpuid reveals a CPU that does not support the hardware instruction.
Of course, to get the fallback code that I trust the compiler folks have implemented right (so I don't have to bring in C or asm code to do my popcount), I need a separate compilation unit that is compiled without the -mpopcnt or -march=corei7 flags.
Is linking together separately compiled code the only way? Are there no compiler intrinsics or other types of hints or other built-ins I don't know about that I can use to have it generate the fallback popcount code?
Upvotes: 5
Views: 3410
Reputation: 21
gcc has a feature called "multiversioning" specifically for this.
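A minimal sketch of what that could look like, assuming GCC (or a recent Clang) on x86 Linux with glibc, since the feature relies on ifunc support. The function name popcount64 and the POPCNT_CLONES macro are made up for illustration. The target_clones attribute asks the compiler to emit one clone built with popcnt enabled and one default clone, and to pick between them when the program is loaded:

```cpp
#include <cstdint>

// Assumption: target_clones needs an x86 target with ifunc support (e.g. glibc).
// On anything else the macro expands to nothing and a single version is built.
#if defined(__GNUC__) && defined(__GLIBC__) && \
    (defined(__x86_64__) || defined(__i386__))
#define POPCNT_CLONES __attribute__((target_clones("popcnt", "default")))
#else
#define POPCNT_CLONES
#endif

// Hypothetical wrapper: the "popcnt" clone may use the hardware instruction,
// while the "default" clone follows whatever the command-line flags allow.
POPCNT_CLONES
int popcount64(std::uint64_t x) {
    return __builtin_popcountll(x);
}
```

Note that the "default" clone only contains the software fallback if the file itself is built without -mpopcnt or an -march value that implies it.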
Upvotes: 2
Reputation: 2859
You could call the fallback code directly. I believe it is accessible from the compiler's runtime library (libgcc) as:
int __popcountsi2 (unsigned int a)
int __popcountdi2 (unsigned long a)
int __popcountti2 (unsigned long long a)
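A rough sketch of calling one of these by hand, assuming a 64-bit Linux-style target where unsigned long is 64 bits and the compiler's runtime library (libgcc or compiler-rt) is linked in as usual. The wrapper name popcount64_fallback is hypothetical, and which of the three routines actually exist can depend on the target's word size, so treat this as something to verify on your platform:

```cpp
#include <cstdint>

// Assumption: the runtime library exports this symbol with C linkage
// and it counts the set bits of a 64-bit argument on this target.
extern "C" int __popcountdi2(unsigned long a);

// Hypothetical wrapper around the library routine.
int popcount64_fallback(std::uint64_t x) {
    static_assert(sizeof(unsigned long) == 8,
                  "sketch assumes a 64-bit unsigned long");
    return __popcountdi2(static_cast<unsigned long>(x));
}
```

Because this is an explicit function call rather than the built-in, it stays a software popcount even when the file is compiled with -mpopcnt.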
Upvotes: 0
Reputation: 3741
I don't know for sure, but the code needed to select between the popcnt instruction and a fallback implementation might cost more in performance than simply using a non-popcnt implementation all of the time.
To switch in an alternate implementation (doing the switching at the site of the popcnt), you will need at least the following:
I suspect that the cost prohibits an effective implementation of the intrinsic that you describe.
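To make the overhead concrete, here is a sketch of switching at the call site, assuming GCC or Clang. All names (popcount_hw, popcount_sw, popcount_dispatch and the macros) are invented for illustration; the target attributes and __builtin_cpu_supports are x86-specific, so on other targets the sketch just uses the plain built-in:

```cpp
#include <cstdint>

#if defined(__x86_64__) || defined(__i386__)
#define WITH_POPCNT    __attribute__((target("popcnt")))
#define WITHOUT_POPCNT __attribute__((target("no-popcnt")))
#define CPU_HAS_POPCNT() (__builtin_cpu_supports("popcnt") != 0)
#else
#define WITH_POPCNT
#define WITHOUT_POPCNT
#define CPU_HAS_POPCNT() false
#endif

// Built with POPCNT enabled regardless of the command-line flags.
WITH_POPCNT
static int popcount_hw(std::uint64_t x) { return __builtin_popcountll(x); }

// Built with POPCNT disabled, so the built-in expands to the software fallback.
WITHOUT_POPCNT
static int popcount_sw(std::uint64_t x) { return __builtin_popcountll(x); }

// Every call pays for a cached-flag check and a branch, and the two
// versions generally cannot be inlined into code built for another target.
int popcount_dispatch(std::uint64_t x) {
    static const bool has_popcnt = CPU_HAS_POPCNT();
    return has_popcnt ? popcount_hw(x) : popcount_sw(x);
}
```

Whether that check is cheaper than always using the software version depends on the workload, so it is worth measuring. Hoisting the decision out of hot loops would reduce the per-call cost.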
Upvotes: 2