Reputation: 1479
I want to perform a shl(mult(var1,var2),1) operation, where mult multiplies var1 and var2 (both are 16-bit signed integers) and shl arithmetically shifts the multiplication result left. The result must be saturated, i.e., int32 max or int32 min if overflow or underflow occurs, and mult(-32768,-32768)=2147483647.
I need to perform this operation on multiple values efficiently, so I am thinking of using the MMX/SSE instruction set. I thought about computing mult(sign_extension(var1), shl(sign_extension(var2))), but I've just discovered that no saturating version of the MMX mult() exists. Do you know any other way to do it?
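For reference, the intended semantics can be written as a minimal scalar sketch in C. The function name mult_shl1_sat is made up for illustration and is not part of any library; the sketch assumes the only input pair that can overflow is (-32768, -32768).

```c
#include <stdint.h>

/* Hypothetical scalar reference for saturating shl(mult(a, b), 1). */
static int32_t mult_shl1_sat(int16_t a, int16_t b)
{
    /* A 16 x 16 signed product always fits in 32 bits (magnitude <= 2^30). */
    int32_t p = (int32_t)a * (int32_t)b;

    /* Doubling overflows only when p == 2^30, i.e. a == b == -32768. */
    if (p == 0x40000000)
        return INT32_MAX;

    /* Multiply by 2 instead of shifting: left-shifting a negative value
       is undefined behaviour in C. */
    return p * 2;
}
```

Any vectorised version should produce the same values as this function for every input pair.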
Upvotes: 1
Views: 296
Reputation: 213060
I think the following should work for you. There is only one potential overflow case (SHRT_MIN * SHRT_MIN) and it handles this explicitly:
    #include <limits.h>
    #include <mmintrin.h>

    int main(void)
    {
        // Odd 16-bit lanes are zero, so each 32-bit madd result is a single product.
        __m64 v1 = _mm_set_pi16(0, SHRT_MAX, 0, SHRT_MIN);
        __m64 v2 = _mm_set_pi16(0, SHRT_MIN, 0, SHRT_MIN);
        __m64 v = _mm_madd_pi16(v1, v2);    // 16 x 16 signed multiply
        v = _mm_slli_pi32(v, 1);            // shift left by 1 bit to get full range
        __m64 vcmp = _mm_cmpeq_pi32(v, _mm_set1_pi32(INT_MIN));
                                            // test for SHRT_MIN * SHRT_MIN overflow
        v = _mm_add_pi32(v, vcmp);          // and correct if needed (INT_MIN - 1 wraps to INT_MAX)
        _mm_empty();                        // clear MMX state before any x87 floating-point code
        return 0;
    }
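If SSE2 is available, the same idea can probably be extended to eight values per iteration. The following is only a sketch under assumptions: the helper name mult_shl1_sat_sse2 and its array interface are hypothetical, n is assumed to be a multiple of 8, and the full 32-bit products are built from _mm_mullo_epi16 / _mm_mulhi_epi16 instead of the zero-padded madd used above.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: out[i] = saturate(2 * a[i] * b[i]).
   Assumes n is a multiple of 8; pointers may be unaligned. */
static void mult_shl1_sat_sse2(const int16_t *a, const int16_t *b,
                               int32_t *out, size_t n)
{
    const __m128i int_min = _mm_set1_epi32(INT_MIN);
    assert(n % 8 == 0);

    for (size_t i = 0; i < n; i += 8) {
        __m128i va = _mm_loadu_si128((const __m128i *)(a + i));
        __m128i vb = _mm_loadu_si128((const __m128i *)(b + i));

        /* Low and high 16 bits of each signed 16 x 16 product. */
        __m128i lo = _mm_mullo_epi16(va, vb);
        __m128i hi = _mm_mulhi_epi16(va, vb);

        /* Interleave the halves into full 32-bit products. */
        __m128i p0 = _mm_unpacklo_epi16(lo, hi);  /* products 0..3 */
        __m128i p1 = _mm_unpackhi_epi16(lo, hi);  /* products 4..7 */

        /* Shift left by 1; only SHRT_MIN * SHRT_MIN wraps to INT_MIN. */
        p0 = _mm_slli_epi32(p0, 1);
        p1 = _mm_slli_epi32(p1, 1);

        /* Where the result is INT_MIN, add -1 so it wraps to INT_MAX. */
        p0 = _mm_add_epi32(p0, _mm_cmpeq_epi32(p0, int_min));
        p1 = _mm_add_epi32(p1, _mm_cmpeq_epi32(p1, int_min));

        _mm_storeu_si128((__m128i *)(out + i), p0);
        _mm_storeu_si128((__m128i *)(out + i + 4), p1);
    }
}
```

The correction step relies on the same observation as the MMX version: a doubled product can equal INT_MIN only when both inputs are SHRT_MIN, so comparing against INT_MIN identifies exactly the lanes that overflowed.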
Upvotes: 3