user1829358

Reputation: 1091

Multiply two vectors of 32bit integers, producing a vector of 32bit result elements

What is the best way to multiply the corresponding 32-bit elements of two __m256i registers?

_mm256_mul_epu32 is not what I'm looking for because it produces 64bit outputs. I want a 32bit result for every 32bit input element.

Moreover, I'm sure that none of the products will overflow 32 bits.

Thanks!

Upvotes: 4

Views: 2480

Answers (1)

Jason R

Reputation: 11758

You want the _mm256_mullo_epi32() intrinsic. From Intel's excellent online intrinsics guide:

Synopsis

__m256i _mm256_mullo_epi32 (__m256i a, __m256i b)
#include "immintrin.h" 
Instruction: vpmulld ymm, ymm, ymm
CPUID Flags: AVX2

Description

Multiply the packed 32-bit integers in a and b, producing intermediate 64-bit integers, and store the low 32 bits of the intermediate integers in dst.
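As a minimal sketch of how this might be used on arrays, the code below multiplies two buffers element-wise, eight 32-bit values at a time. The helper names mul32 and mul32_avx2 are made up for illustration. It assumes GCC or Clang on x86, since it relies on the target attribute and __builtin_cpu_supports, and it falls back to a scalar loop when AVX2 is unavailable.

```c
#include <immintrin.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: AVX2 path. The target attribute (GCC/Clang only)
   lets this one function use AVX2 without compiling the whole file
   with -mavx2. */
__attribute__((target("avx2")))
static void mul32_avx2(const int32_t *a, const int32_t *b,
                       int32_t *dst, size_t n)
{
    size_t i = 0;

    /* Process 8 x 32-bit elements per iteration (unaligned loads/stores). */
    for (; i + 8 <= n; i += 8) {
        __m256i va = _mm256_loadu_si256((const __m256i *)(a + i));
        __m256i vb = _mm256_loadu_si256((const __m256i *)(b + i));
        /* Low 32 bits of each 32x32 product. */
        __m256i prod = _mm256_mullo_epi32(va, vb);
        _mm256_storeu_si256((__m256i *)(dst + i), prod);
    }

    /* Scalar tail for the remaining n % 8 elements. Unsigned arithmetic
       avoids signed-overflow undefined behaviour. */
    for (; i < n; ++i)
        dst[i] = (int32_t)((uint32_t)a[i] * (uint32_t)b[i]);
}

/* Hypothetical wrapper: picks the AVX2 path only if the CPU supports it. */
void mul32(const int32_t *a, const int32_t *b, int32_t *dst, size_t n)
{
    if (__builtin_cpu_supports("avx2")) {
        mul32_avx2(a, b, dst, n);
        return;
    }
    for (size_t i = 0; i < n; ++i)
        dst[i] = (int32_t)((uint32_t)a[i] * (uint32_t)b[i]);
}
```

Because only the low 32 bits are kept, the result is the same for signed and unsigned inputs, so the same intrinsic works for both as long as the products fit in 32 bits.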

Upvotes: 7
