I would like to compare two vectors of doubles based on their absolute values.
That is, the vector equivalent of the following:
if (fabs(x) < fabs(y)) {
...
}
Is there anything better than just taking the absolute value of each side and following up with a _mm256_cmp_pd?
Interested in all of AVX, AVX2, and AVX-512 flavors.
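For reference, the baseline described above might look like the following sketch. It assumes an x86 CPU with AVX and a GCC/Clang-style target attribute; the helper name abs_lt_mask_avx and the pointer-based interface are hypothetical, chosen only to keep the example self-contained.

```c
#include <immintrin.h>

#if defined(__GNUC__) || defined(__clang__)
#define TARGET_AVX __attribute__((target("avx")))
#else
#define TARGET_AVX
#endif

/* Hypothetical helper: returns a 4-bit mask, bit i is set
   when fabs(x[i]) < fabs(y[i]). */
TARGET_AVX
int abs_lt_mask_avx(const double *x, const double *y)
{
    /* -0.0 has only the sign bit (bit 63) set in each lane. */
    const __m256d sign = _mm256_set1_pd(-0.0);
    /* andnot computes (~sign) & v, which clears the sign bit: |v|. */
    __m256d ax = _mm256_andnot_pd(sign, _mm256_loadu_pd(x));
    __m256d ay = _mm256_andnot_pd(sign, _mm256_loadu_pd(y));
    /* Ordered, non-signaling less-than: false if either lane is NaN. */
    __m256d lt = _mm256_cmp_pd(ax, ay, _CMP_LT_OQ);
    return _mm256_movemask_pd(lt);
}
```

That is two bitwise operations plus one compare per vector, which is the cost the question asks to improve on.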
Upvotes: 5
Views: 984
With AVX-512 you can save one µop. Instead of 2x vandpd + vcmppd you can use vpternlogq + vpcmpuq. Note that the solution below assumes that none of the numbers is a NaN.
IEEE-754 floating point numbers have the nice property that they are encoded such that if x[62:0] integer_less_than y[62:0], then as a floating point: abs(x)<abs(y).
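A scalar sketch of this property, assuming 64-bit IEEE-754 doubles and non-NaN inputs (the helper names bits_of and abs_lt_bits are made up for illustration):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: raw IEEE-754 bit pattern of a double. */
static uint64_t bits_of(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* fabs(x) < fabs(y) for non-NaN inputs, using only integer operations:
   mask off the sign bit and compare bits 62:0 as unsigned integers. */
static int abs_lt_bits(double x, double y)
{
    const uint64_t mag = 0x7FFFFFFFFFFFFFFFull;   /* bits 62:0 */
    return (bits_of(x) & mag) < (bits_of(y) & mag);
}
```

The exponent sits above the mantissa in the encoding, so a larger magnitude always has a larger bit pattern once the sign bit is ignored.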
So, instead of setting both sign bits to 0, we can copy the sign bit of x to the sign bit of y and compare the result as an unsigned integer.
In the (untested) code below, for negative x both xi[63] and yi_sgnx[63] are 1, while for positive x, both xi[63] and yi_sgnx[63] are 0. So the unsigned integer compare actually compares xi[62:0] with yi[62:0], which is just what we need for the comparison abs(x)<abs(y).
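The same idea on a single double, as a scalar sketch assuming 64-bit IEEE-754 doubles and non-NaN inputs (the names dbl_bits and abs_lt_signcopy are hypothetical):

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: raw IEEE-754 bit pattern of a double. */
static uint64_t dbl_bits(double d)
{
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* fabs(x) < fabs(y) without clearing either sign bit. */
static int abs_lt_signcopy(double x, double y)
{
    const uint64_t mag = 0x7FFFFFFFFFFFFFFFull;      /* bits 62:0 */
    uint64_t xi = dbl_bits(x);
    uint64_t yi = dbl_bits(y);
    /* Magnitude bits from y, sign bit from x. */
    uint64_t yi_sgnx = (yi & mag) | (xi & ~mag);
    /* Bit 63 is now equal on both sides, so the unsigned compare
       is decided by bits 62:0 alone. */
    return xi < yi_sgnx;
}
```

Only one operand is modified here, which is where the saved operation comes from in the vector version.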
The vpternlog instruction is suitable for copying the sign bit, see here or here. I'm not sure if the constants z and 0xCA are chosen correctly.
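#include <immintrin.h>

__mmask8 cmplt_via_ternlog(__m512d x, __m512d y){
    __m512i xi = _mm512_castpd_si512(x);
    __m512i yi = _mm512_castpd_si512(y);
    __m512i z  = _mm512_set1_epi64(0x7FFFFFFFFFFFFFFFull);   /* selects bits 62:0 */
    /* imm8 0xCA acts as a bitwise select, z ? yi : xi: bits 62:0 from y, bit 63 from x */
    __m512i yi_sgnx = _mm512_ternarylogic_epi64(z, yi, xi, 0xCA);
    return _mm512_cmp_epu64_mask(xi, yi_sgnx, _MM_CMPINT_LT);   /* unsigned xi < yi_sgnx */
}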
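One way to check the 0xCA constant without AVX-512 hardware is a scalar model of the ternary-logic operation. The sketch below follows the documented behaviour of _mm512_ternarylogic_epi64, where for each bit position the bits of the three operands form an index (first operand as the most significant bit) into imm8; the function name ternlog64 is made up for this example.

```c
#include <stdint.h>

/* Scalar model of vpternlogq on one 64-bit lane. */
static uint64_t ternlog64(uint64_t a, uint64_t b, uint64_t c, uint8_t imm8)
{
    uint64_t r = 0;
    for (int i = 0; i < 64; ++i) {
        /* 3-bit truth-table index built from bit i of a, b and c. */
        unsigned idx = (unsigned)((((a >> i) & 1) << 2) |
                                  (((b >> i) & 1) << 1) |
                                   ((c >> i) & 1));
        /* Bit idx of imm8 is the result bit for this position. */
        r |= (uint64_t)((imm8 >> idx) & 1) << i;
    }
    return r;
}
```

With this model, ternlog64(z, yi, xi, 0xCA) equals (z & yi) | (~z & xi), so with z = 0x7FFFFFFFFFFFFFFF the result takes bits 62:0 from yi and bit 63 from xi.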
Upvotes: 3