How to compute the norm of a 256-bit variable using Intel AVX

Question

I'd like to compute the norm of a vector stored in a __m256d variable.
To do so, I implemented the ymmnorm function, which returns the result in a __m256d variable:

// Euclidean norm of x, replicated in all four elements of the result.
__m256d ymmnorm(__m256d const x)
{
    return _mm256_sqrt_pd(ymmdot(x, x));
}

It relies on the dot product function suggested here:

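__m256d ymmdot(__m256d const x, __m256d const y)
{
    // Element-wise products: x0*y0, x1*y1, x2*y2, x3*y3
    __m256d xy = _mm256_mul_pd(x, y);
    // Pairwise sums within each 128-bit half
    __m256d temp = _mm256_hadd_pd(xy, xy);
    __m128d hi128 = _mm256_extractf128_pd(temp, 1);
    // Low half + high half: the full dot product in both elements
    __m128d dotproduct = _mm_add_pd(_mm256_castpd256_pd128(temp), hi128);

    // Replicate the result across all four elements
    return _mm256_broadcast_pd(&dotproduct);
}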

However, I am new to the SIMD/AVX world, so I am wondering: is there a smarter or better way to compute the norm of a vector held in a 256-bit variable?
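
For comparison, here is a minimal sketch of one possible variant, not a benchmarked or definitive answer. It reduces the four products to a single scalar first, takes one scalar square root, and broadcasts only if a vector result is really needed. The names hsum256_pd, ymmnorm_scalar, ymmnorm_bcast and norm4 are hypothetical and do not come from the code above, and the AVX_TARGET macro is an assumption that only matters for GCC/Clang builds without -mavx. Whether this is actually faster than the hadd-based version depends on the CPU and compiler, so it should be measured.

```c
#include <immintrin.h>

// Assumption: on GCC/Clang this lets the functions compile without -mavx.
// It is harmless when -mavx is already passed and empty on other compilers.
#if defined(__GNUC__)
#define AVX_TARGET __attribute__((target("avx")))
#else
#define AVX_TARGET
#endif

// Hypothetical helper: horizontal sum of the four doubles in v.
AVX_TARGET static inline double hsum256_pd(__m256d v)
{
    __m128d lo   = _mm256_castpd256_pd128(v);    // v0, v1
    __m128d hi   = _mm256_extractf128_pd(v, 1);  // v2, v3
    __m128d sum2 = _mm_add_pd(lo, hi);           // v0+v2, v1+v3
    __m128d high = _mm_unpackhi_pd(sum2, sum2);  // v1+v3, v1+v3
    return _mm_cvtsd_f64(_mm_add_sd(sum2, high)); // v0+v1+v2+v3
}

// Hypothetical: norm as a plain double, using a scalar square root.
AVX_TARGET static inline double ymmnorm_scalar(__m256d x)
{
    __m128d s = _mm_set_sd(hsum256_pd(_mm256_mul_pd(x, x)));
    return _mm_cvtsd_f64(_mm_sqrt_sd(s, s));
}

// Hypothetical: same result replicated in all four elements,
// matching the return type of ymmnorm above.
AVX_TARGET static inline __m256d ymmnorm_bcast(__m256d x)
{
    return _mm256_set1_pd(ymmnorm_scalar(x));
}

// Hypothetical convenience wrapper: norm of four doubles in memory
// (unaligned load, so p needs no special alignment).
AVX_TARGET double norm4(const double *p)
{
    return ymmnorm_scalar(_mm256_loadu_pd(p));
}
```

As a usage example, calling norm4 on an array holding {1.0, 2.0, 2.0, 4.0} sums the squares to 25 and returns 5.0. The design choice here is to keep the reduction in 128-bit registers and return a scalar, so the broadcast is paid for only by callers that need the norm in every element.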
