AdamJames

Reputation: 367

Clang vector extensions and the equality operator in C++

I wrote a vector type using the Clang SIMD vector extensions. It works well, except when I need to check if two vectors are equal. The == operator doesn't seem to be defined correctly for Clang's vector types. Attempting to compare two vectors with == bizarrely seems to evaluate to a third vector of the same type as the two being compared, instead of a bool. I find this odd, since applying other operations like + or - compiles with no trouble, and outputs the expected results. Here's my code, compiled with Clang 3.5 (Xcode):

// in vect.h 
template <typename NumericType>
using vec2 = NumericType __attribute__((ext_vector_type(2))) ;

//in main.cpp
#include "vect.h"

int main(int argc, const char ** argv) {

    vec2<int> v0 {0, 1} ;
    vec2<int> v1 {0, 1} ;

    vec2<int> sumVs = v0 + v1 ; //OK: evaluates to {0, 2} when run

    bool equal = (v0 == v1) ; /* Compiler error with message: "Cannot initialize
        a variable of type 'bool' with an rvalue of type 'int __attribute__((ext_vector_type(2)))'" */

    return 0;
}

Is there any way to enable using operator == with Clang's vector types, or any other workaround to this problem? Since they're considered primitive and not class types, I can't overload a comparison operator myself, and writing a global equals() function seems kludgy and inelegant.

Update: Or if no one has the solution I'm looking for, perhaps someone could explain the default behavior of the == operator when comparing two SIMD vectors?

Update #2: Hurkyl suggested == on two vectors does a vectorized comparison. I updated my code to test that possibility:

template <typename NumericType>
using vec3 = NumericType __attribute__((ext_vector_type(3))) ;

int main(int argc, const char ** argv) {

    vec3<int> v0 {1, 2, 3} ;
    vec3<int> v1 {3, 2, 1} ;

    auto compareVs = (v0 == v1) ;

    return 0;
}

LLDB reports the value of compareVs as {0, -1, 0}, which seems almost right if that's what's happening, but it seems weird that true would be -1, and false be 0.
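The lane-wise behaviour reported by LLDB can be reproduced in a small sketch. It assumes a four-lane int vector, declared with ext_vector_type under Clang and with the GCC-style vector_size attribute otherwise. The names int4, equal_mask and equal_mask_example are illustrative and not part of the original code. A lane value of -1 is simply an int with every bit set, which is why "true" shows up as -1.

```cpp
#include <cassert>

// Four-lane int vector: Clang's ext_vector_type when available,
// otherwise the GCC-style vector_size attribute (16 bytes = 4 ints).
#if defined(__clang__)
typedef int int4 __attribute__((ext_vector_type(4)));
#else
typedef int int4 __attribute__((vector_size(16)));
#endif

// Lane-wise comparison: each lane of the result is -1 where the
// inputs match and 0 where they differ.
inline int4 equal_mask(int4 a, int4 b) {
    return a == b;
}

// Illustrative usage: compare {1, 2, 3, 4} with {3, 2, 1, 4}.
inline void equal_mask_example() {
    int4 v0 = {1, 2, 3, 4};
    int4 v1 = {3, 2, 1, 4};
    int4 m = equal_mask(v0, v1);
    assert(m[0] == 0 && m[1] == -1 && m[2] == 0 && m[3] == -1);
    // -1 is the two's-complement pattern with all 32 bits set.
    assert(static_cast<unsigned>(m[1]) == 0xFFFFFFFFu);
}
```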

Update #3: OK, thanks to the feedback I've gotten, I now have a better understanding of how relational and comparison operators are applied to vectors. But my basic problem remains the same. I need a simple and elegant way to check, for any two SIMD-type vectors v1 and v2, whether they are equivalent. In other words, I need to be able to check that for every index i in v1 and v2, v1[i] == v2[i], expressed as a single boolean value (that is, not as a vector/array of bool). If the only answer really is a function like:

template <typename NumericType>
bool equals(vec2<NumericType> v1, vec2<NumericType> v2) ...

... then I'll accept that. But I'm hoping someone can suggest something less clumsy.
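One possible shape for such a function is sketched below. It is fixed to a four-lane int vector rather than being templated on the element type. The names int4, all_equal and Int4 are hypothetical. The idea is to AND the lanes of the comparison mask together, so the result stays -1 only if no lane was 0. The small wrapper struct is one way to get an == that yields bool, since a class type may overload the operator even though the raw vector type cannot.

```cpp
#include <cassert>

// Four-lane int vector: ext_vector_type under Clang, vector_size otherwise.
#if defined(__clang__)
typedef int int4 __attribute__((ext_vector_type(4)));
#else
typedef int int4 __attribute__((vector_size(16)));
#endif

// True only if every lane of a equals the corresponding lane of b.
inline bool all_equal(int4 a, int4 b) {
    int4 mask = (a == b);  // -1 per equal lane, 0 per differing lane
    // AND-ing the lanes leaves -1 only if no lane was 0.
    return (mask[0] & mask[1] & mask[2] & mask[3]) == -1;
}

// Hypothetical wrapper: as a class type it may overload operator==,
// which the built-in vector type does not allow.
struct Int4 {
    int4 v;
    friend bool operator==(Int4 a, Int4 b) { return all_equal(a.v, b.v); }
    friend bool operator!=(Int4 a, Int4 b) { return !(a == b); }
};

// Illustrative usage of both the free function and the wrapper.
inline void all_equal_example() {
    int4 x = {0, 1, 2, 3};
    int4 y = {0, 1, 2, 3};
    int4 z = {0, 1, 9, 3};
    assert(all_equal(x, y));
    assert(!all_equal(x, z));
    Int4 wx = {x};
    Int4 wz = {z};
    assert(wx != wz);
}
```

The wrapper costs an extra .v whenever arithmetic on the raw vector is needed, so whether it is less clumsy than a plain function is a matter of taste.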

Upvotes: 4

Views: 2211

Answers (2)

Walter

Reputation: 45444

If instead of using the compiler-specific language extensions you use the intrinsics (as provided, for example, in xmmintrin.h and emmintrin.h), then you can use _mm_movemask_ps(__m128) and its relatives. For example

__m128i a,b;
/* some code to fill a,b with integer elements */
bool a_equals_b = 15 == _mm_movemask_ps(_mm_castsi128_ps(_mm_cmpeq_epi32(a,b)));

This code works as follows. First, _mm_cmpeq_epi32(a,b) generates another __m128i with each of the four elements either all bits 0 or all bits 1 (I presume operator== for the compiler vector extensions calls exactly this intrinsic). _mm_castsi128_ps then reinterprets that result as a __m128 without changing any bits, because _mm_movemask_ps takes a __m128. Next, int _mm_movemask_ps(__m128) returns an integer with the kth bit set to the sign bit of the kth element of its argument. Thus, if a==b for all elements, the expression returns 1|2|4|8=15.

I don't know the compiler-supported language extensions, but if you can obtain the underlying __m128 (for 128 bit wide vectors), then you can use this approach (possibly only the call to _mm_movemask_ps()).
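A self-contained version of the intrinsic approach might look like the sketch below. It assumes an x86 or x86-64 target with SSE2. The function names equal_epi32 and equal_epi32_example are made up for illustration. The vectors are built with _mm_set_epi32, which takes its arguments from the highest lane down to the lowest.

```cpp
#include <cassert>
#include <emmintrin.h>  // SSE2 integer intrinsics (x86 / x86-64 only)

// True if all four 32-bit integer lanes of a and b are equal.
inline bool equal_epi32(__m128i a, __m128i b) {
    // Each lane becomes all ones where equal, all zeros where not.
    __m128i mask = _mm_cmpeq_epi32(a, b);
    // Reinterpret as four floats so movemask_ps can collect the four sign bits.
    return _mm_movemask_ps(_mm_castsi128_ps(mask)) == 0xF;
}

// Illustrative usage.
inline void equal_epi32_example() {
    __m128i a = _mm_set_epi32(4, 3, 2, 1);
    __m128i b = _mm_set_epi32(4, 3, 2, 1);
    __m128i c = _mm_set_epi32(4, 3, 0, 1);  // differs in one lane
    assert(equal_epi32(a, b));
    assert(!equal_epi32(a, c));
}
```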

Upvotes: 4

Ben Voigt

Reputation: 283763

Using the bitwise-complement of false as a true value isn't that unusual (see BASIC, for example).

It's particularly useful in vector arithmetic if you want to use it to implement a branch-free ternary operator:

r = (a == c)? b: d

becomes

selector = (a == c)
r = (b & selector) | (d & ~selector)
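Written against the vector extensions, the same select could be sketched as follows. It assumes a four-lane int vector, and the names int4, select_lanes and select_lanes_example are illustrative. Because each mask lane is either all ones or all zeros, the AND keeps a lane of b or of d intact and the OR merges the two halves.

```cpp
#include <cassert>

// Four-lane int vector: ext_vector_type under Clang, vector_size otherwise.
#if defined(__clang__)
typedef int int4 __attribute__((ext_vector_type(4)));
#else
typedef int int4 __attribute__((vector_size(16)));
#endif

// Per lane: r[i] = (a[i] == c[i]) ? b[i] : d[i], without branching.
inline int4 select_lanes(int4 a, int4 c, int4 b, int4 d) {
    int4 selector = (a == c);                 // -1 where equal, 0 elsewhere
    return (b & selector) | (d & ~selector);  // pick b or d lane by lane
}

// Illustrative usage: lanes 0 and 2 match, so they take values from b.
inline void select_lanes_example() {
    int4 a = {1, 2, 3, 4};
    int4 c = {1, 0, 3, 0};
    int4 b = {10, 20, 30, 40};
    int4 d = {-1, -2, -3, -4};
    int4 r = select_lanes(a, c, b, d);
    assert(r[0] == 10 && r[1] == -2 && r[2] == 30 && r[3] == -4);
}
```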

Upvotes: 3
