Reputation: 4315
I want to vectorize a multiplication operation. I tried using _mm_mul_epi32, but my CPU only supports the "MMX, SSE (1,2,3,3S), EM64T" instruction sets.
Can someone please tell me whether there is another function I can try?
Upvotes: 4
Views: 351
Reputation: 212929
It depends on the range of your multiplicands. If they fit within 16 bits then there are a number of 16 x 16 bit multiply SSE instructions available prior to SSE4 (e.g. _mm_madd_epi16, _mm_mulhi_epi16, _mm_mullo_epi16, _mm_mulhrs_epi16, etc).
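As a rough illustration of the 16-bit case, here is a minimal sketch that combines _mm_mullo_epi16 and _mm_mulhi_epi16 to get full 32-bit products using only SSE2. The helper name mul16x8_full and the array-based interface are made up for this example.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: multiplies eight pairs of signed 16-bit values
   and writes the eight full 32-bit products to out. */
static void mul16x8_full(const int16_t *a, const int16_t *b, int32_t *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    __m128i lo = _mm_mullo_epi16(va, vb);  /* low 16 bits of each product  */
    __m128i hi = _mm_mulhi_epi16(va, vb);  /* high 16 bits (signed)        */

    /* Interleave low and high halves to rebuild the 32-bit products. */
    _mm_storeu_si128((__m128i *)out,       _mm_unpacklo_epi16(lo, hi));
    _mm_storeu_si128((__m128i *)(out + 4), _mm_unpackhi_epi16(lo, hi));
}
```

If the products themselves are known to fit in 16 bits, _mm_mullo_epi16 alone should be enough and the interleave step can be dropped.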
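If you need 32 bit operands but they are unsigned then you can use _mm_mul_epu32 (note that it only multiplies the low 32 bits of each 64 bit lane, so you get two 64 bit products per instruction).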
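To show how the unsigned 32-bit case might look, here is a sketch that uses _mm_mul_epu32 twice to cover all four elements of a vector: once for the even-indexed elements and once, after a shift, for the odd-indexed ones. The helper name mul32x4_full and its interface are hypothetical.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Hypothetical helper: multiplies four pairs of unsigned 32-bit values
   and writes the four full 64-bit products to out. */
static void mul32x4_full(const uint32_t *a, const uint32_t *b, uint64_t *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    /* _mm_mul_epu32 uses elements 0 and 2 only: a0*b0, a2*b2 */
    __m128i even = _mm_mul_epu32(va, vb);

    /* Shift each 64-bit lane right by 32 bits to bring elements
       1 and 3 into position: a1*b1, a3*b3 */
    __m128i odd = _mm_mul_epu32(_mm_srli_epi64(va, 32),
                                _mm_srli_epi64(vb, 32));

    /* Put the products back in element order. */
    _mm_storeu_si128((__m128i *)out,       _mm_unpacklo_epi64(even, odd));
    _mm_storeu_si128((__m128i *)(out + 2), _mm_unpackhi_epi64(even, odd));
}
```

If only the low 32 bits of each product are needed, they are the same for signed and unsigned operands, so the same two multiplies can be used for signed inputs as well.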
Alternatively you may convert to float, and use _mm_mul_ps (integer <-> float conversion in SSE is quite efficient, and the cost may be justified if it gets you a 4x SIMD throughput improvement).
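A minimal sketch of the float route, assuming the inputs and their products all stay within 2^24 in magnitude: a single-precision float has a 24-bit significand, so larger values would be rounded and the result would no longer be exact. The helper name mul32x4_via_float is hypothetical.

```c
#include <emmintrin.h>  /* SSE2 intrinsics (also pulls in the SSE float ops) */
#include <stdint.h>

/* Hypothetical helper: multiplies four pairs of signed 32-bit integers
   by converting to float and back.
   ASSUMPTION: every input and every product has magnitude <= 2^24,
   otherwise the float representation is inexact. */
static void mul32x4_via_float(const int32_t *a, const int32_t *b, int32_t *out)
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);

    __m128 fa = _mm_cvtepi32_ps(va);       /* int32 -> float */
    __m128 fb = _mm_cvtepi32_ps(vb);
    __m128 fp = _mm_mul_ps(fa, fb);        /* four multiplies at once */

    /* float -> int32 with truncation; exact under the assumption above */
    _mm_storeu_si128((__m128i *)out, _mm_cvttps_epi32(fp));
}
```

Whether this is actually faster than the integer approaches depends on the CPU and the surrounding code, so it is worth measuring.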
Upvotes: 4