Reputation: 367
I wrote a vector type using the Clang SIMD vector extensions. It works well, except when I need to check if two vectors are equal. The == operator doesn't seem to be defined correctly for Clang's vector types. Attempting to compare two vectors with == bizarrely seems to evaluate to a third vector of the same type as the two being compared, instead of a bool. I find this odd, since applying other operations like + or - compiles with no trouble, and outputs the expected results. Here's my code, compiled with Clang 3.5 (Xcode):
// in vect.h
template <typename NumericType>
using vec2 = NumericType __attribute__((ext_vector_type(2))) ;
//in main.cpp
#include "vect.h"
int main(int argc, const char ** argv) {
    vec2<int> v0 {0, 1} ;
    vec2<int> v1 {0, 1} ;
    vec2<int> sumVs = v0 + v1 ; //OK: evaluates to {0, 2} when run
    bool equal = (v0 == v1) ; /* Compiler error with message: "Cannot initialize
    a variable of type 'bool' with an rvalue of type 'int __attribute__((ext_vector_type(2)))'" */
    return 0;
}
Is there any way to enable using operator == with Clang's vector types, or any other workaround to this problem? Since they're considered primitive and not class types, I can't overload a comparison operator myself, and writing a global equals() function seems kludgy and inelegant.
Update: Or if no one has the solution I'm looking for, perhaps someone could explain the default behavior of the == operator when comparing two SIMD vectors?
Update #2: Hurkyl suggested that == on two vectors does a vectorized (element-wise) comparison. I updated my code to test that possibility:
template <typename NumericType>
using vec3 = NumericType __attribute__((ext_vector_type(3))) ;
int main(int argc, const char ** argv) {
    vec3<int> v0 {1, 2, 3} ;
    vec3<int> v1 {3, 2, 1} ;
    auto compareVs = (v0 == v1) ;
    return 0;
}
LLDB reports the value of compareVs as {0, -1, 0}, which seems almost right if that's what's happening, but it seems weird that true would be -1, and false be 0.
Update #3: Ok, so thanks to the feedback I've gotten, I now have a better understanding of how relational and comparison operators are applied to vectors. But my basic problem remains the same. I need a simple and elegant way to check, for any two SIMD-type vectors v1 and v2, whether they are equivalent. In other words, I need to be able to check that for every index i in v1 and v2, v1[i] == v2[i], expressed as a single boolean value (that is, not as a vector/array of bool). If the only answer really is a function like:
template <typename NumericType>
bool equals(vec2<NumericType> v1, vec2<NumericType> v2) ...
... then I'll accept that. But I'm hoping someone can suggest something less clumsy.
Upvotes: 4
Views: 2211
Reputation: 45444
If instead of using the compiler-specific language extensions you use the intrinsics (as provided, for example, in xmmintrin.h), then you can use _mm_movemask_ps(__m128) and its relatives. For example
__m128i a,b;
/* some code to fill a,b with integer elements */
bool a_equals_b = 15 == _mm_movemask_ps(_mm_castsi128_ps(_mm_cmpeq_epi32(a,b)));
This code works as follows. First, _mm_cmpeq_epi32(a,b) generates another __m128i with each of the four elements either all bits 0 or all bits 1 (I presume operator== for the compiler-generated vector extensions calls exactly this intrinsic). Next, int _mm_movemask_ps(__m128) returns an integer with the kth bit set to the sign bit of the kth element of its argument; _mm_castsi128_ps only reinterprets the integer vector as a __m128 so that it can be passed in. Thus, if a==b for all elements, then the whole expression returns 1|2|4|8=15.
I don't know the compiler-supported language extensions, but if you can obtain the underlying __m128 (for 128 bit wide vectors), then you can use this approach (possibly only the call to _mm_movemask_ps()).
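As a rough sketch of the approach above, the comparison and mask extraction can be wrapped in a small helper. The name all_lanes_equal is made up for illustration, and the code assumes an x86 target with SSE2 and four 32-bit integer lanes; it is not meant as the definitive way to do this.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics; also pulls in xmmintrin.h

// Hypothetical helper (name is an assumption, not from any library):
// returns true only if all four 32-bit integer lanes of a and b are equal.
inline bool all_lanes_equal(__m128i a, __m128i b) {
    // Each lane becomes all ones where the inputs match, all zeros otherwise.
    __m128i mask = _mm_cmpeq_epi32(a, b);
    // Reinterpret as four floats so _mm_movemask_ps can collect the four sign bits.
    int bits = _mm_movemask_ps(_mm_castsi128_ps(mask));
    // Bits 0..3 are all set (1|2|4|8) exactly when every lane matched.
    return bits == 15;
}
```

Calling all_lanes_equal(_mm_set_epi32(4, 3, 2, 1), _mm_set_epi32(4, 3, 2, 1)) should give true, and changing any single lane of one argument should give false.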
Upvotes: 4
Reputation: 283763
Using the bitwise-complement of false as a true value isn't that unusual (see BASIC, for example).
It's particularly useful in vector arithmetic if you want to use it to implement a branch-free ternary operator:
r = (a == c)? b: d
becomes
selector = (a == c)
r = (b & selector) | (d & ~selector)
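To make the pseudocode above concrete, here is a minimal sketch using compiler vector types. It assumes the GCC-style vector_size attribute (which Clang also accepts) rather than ext_vector_type, so that it builds with either compiler; the names int4 and select_eq are invented for this example.

```cpp
// Assumption: a GCC-style vector of four 32-bit ints (16 bytes in total).
typedef int int4 __attribute__((vector_size(16)));

// Hypothetical helper: lane-wise r = (a == c) ? b : d, without branching.
inline int4 select_eq(int4 a, int4 c, int4 b, int4 d) {
    // The comparison yields -1 (all bits set) in lanes where a == c, 0 elsewhere.
    int4 selector = (a == c);
    // Keep b where the selector is all ones, d where it is all zeros.
    return (b & selector) | (d & ~selector);
}
```

With a = {1, 2, 3, 4}, c = {1, 0, 3, 0}, b = {10, 20, 30, 40} and d = {-1, -2, -3, -4}, lanes 0 and 2 compare equal, so those lanes take their values from b and the others from d. This only works because "true" is represented as all bits set; a true value of 1 would mask off everything but the lowest bit.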
Upvotes: 3