Reputation: 24806
I would like to use the _bzhi_u32 intrinsic, but I don't want to use the -mbmi2 flag, since that makes gcc use other BMI2 instructions (notably SHLX in many << shifts), which will produce SIGILL (Illegal instruction) if the host where the executable runs does not support BMI2.
I only use _bzhi_u32 in one function, and I guard its use by checking at runtime that it is supported via __builtin_cpu_is("corei7"), defaulting to another implementation if it is not. But I cannot guard the other BMI2 instructions that gcc inserts when -mbmi2 is used.
The problem is that the _bzhi_u32 intrinsic won't be defined in x86intrin.h unless -mbmi2 is specified (with the undesired effect of gcc sprinkling SHLX all over the place).
Upvotes: 4
Views: 878
Reputation: 24806
There are three possible alternatives to avoid specifying -mbmi2 globally:
1. Include x86intrin.h and declare the function that uses _bzhi_u32 with __attribute__((target ("bmi2"))). That way gcc will generate BMI2 instructions only in that function. This doesn't work on GCC 4.8 and lower (_bzhi_u32 is not defined unless __BMI2__ is set, and even if it is, the linker will complain with undefined reference to '_bzhi_u32').
2. Move the function to a separate .c file and put #pragma GCC target "bmi2" at the top. This defines __BMI2__ and enables BMI2 instruction generation for this translation unit only.
3. Move the function to a separate file and compile just that file with -mbmi2 (which is equivalent to the #pragma GCC target option).
Options 2 and 3 limit your inline and static options. Option 1 is the way to go if you are using GCC 4.9 or higher.
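As a rough sketch of option 1 combined with a runtime check, something like the following could work, assuming GCC 4.9 or later on x86. The function names (zero_high_bits, zero_high_bits_bmi2, zero_high_bits_generic) are made up for illustration. The sketch also assumes a compiler whose __builtin_cpu_supports accepts "bmi2" (older releases may not), and it uses that feature check instead of the __builtin_cpu_is("corei7") check mentioned in the question, since a feature test is more direct.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <x86intrin.h>

/* Only this function is compiled with BMI2 enabled (GCC 4.9 or later). */
__attribute__((target("bmi2")))
static uint32_t zero_high_bits_bmi2(uint32_t value, uint32_t index)
{
    return _bzhi_u32(value, index);
}
#endif

/* Portable fallback giving the same result as BZHI: the index is taken
   from its low 8 bits, and an index of 32 or more leaves value unchanged. */
static uint32_t zero_high_bits_generic(uint32_t value, uint32_t index)
{
    index &= 0xFF;
    if (index >= 32)
        return value;
    return value & ((UINT32_C(1) << index) - 1);
}

/* Hypothetical dispatcher: picks the implementation at run time. */
uint32_t zero_high_bits(uint32_t value, uint32_t index)
{
#if defined(__x86_64__) || defined(__i386__)
    if (__builtin_cpu_supports("bmi2"))
        return zero_high_bits_bmi2(value, index);
#endif
    return zero_high_bits_generic(value, index);
}
```

Because the rest of the file is compiled without BMI2, gcc should not emit SHLX or other BMI2 instructions outside zero_high_bits_bmi2, and it will not inline that function into callers built for a lesser target.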
Upvotes: 3
Reputation: 3917
Instead of using the intrinsic, it may be easier to embed the assembler code...
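uint32_t val, i;  /* val: value to mask, i: index of the first bit to clear */
/* GCC's default AT&T syntax orders the operands as index, source, destination. */
asm ("bzhi %2, %1, %0" : "=r"(val) : "r"(val), "r"(i));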
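One way the inline assembly might be wrapped in a reusable function is sketched below, assuming GCC-style extended asm with the default AT&T syntax, where BZHI takes its operands as index, source, destination. The name bzhi_asm is hypothetical. The assembler still has to know the BZHI mnemonic, and the caller still has to confirm BMI2 support at run time before calling it.

```c
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
/* Hypothetical wrapper: emits BZHI directly, so -mbmi2 is not needed.
   Calling it on a CPU without BMI2 raises SIGILL. */
static inline uint32_t bzhi_asm(uint32_t value, uint32_t index)
{
    uint32_t result;
    /* %0 = result (destination), %1 = value (source), %2 = index */
    __asm__ ("bzhi %2, %1, %0"
             : "=r"(result)
             : "r"(value), "r"(index)
             : "cc");  /* BZHI modifies the flags */
    return result;
}
#endif
```

Using a separate output variable keeps the input value intact, and the "cc" clobber documents that the instruction changes the condition flags.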
Upvotes: 1
Reputation: 7925
Quote from gcc 4.9 release notes:
It is now possible to call x86 intrinsics from select functions in a file that are tagged with the corresponding target attribute without having to compile the entire file with the -mxxx option. This improves the usability of x86 intrinsics and is particularly useful when doing Function Multiversioning.
Upvotes: 2