Reputation: 175
I have the following C++ code that multiplies the elements of two large arrays (of size count) and accumulates the sum:
double* pA1 = /* large array */;
double* pA2 = /* large array */;
// mm, count, and lg are defined elsewhere; the pointers walk backwards.
for (register int r = mm; r <= count; ++r)
{
    lg += *pA1-- * *pA2--;
}
Is there a way I can parallelize this code?
Upvotes: 0
Views: 515
Reputation: 50668
Here is an alternative OpenMP implementation that is simpler (and a bit faster on many-core platforms):
double dot_prod_parallel(double* v1, double* v2, int dim)
{
    TimeMeasureHelper helper; // the asker's timing helper, defined elsewhere
    double sum = 0.;
    // Each thread accumulates a private partial sum; OpenMP combines them at the end.
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < dim; ++i)
        sum += v1[i] * v2[i];
    return sum;
}
GCC and ICC are able to vectorize this loop at -O3. Clang 13.0 fails to do so, even with -ffast-math, and even with explicit OpenMP SIMD directives or loop tiling. This appears to be a bug in Clang's optimizer related to OpenMP... Note that you can pass -mavx to use the AVX instruction set, which can be up to twice as fast as SSE (the default). It is available on almost all recent x86-64 PC processors.
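For reference, here is a minimal sketch of the SIMD variant mentioned above: omp parallel for simd is a standard OpenMP 4.0+ composite construct that requests both thread-level worksharing and vectorization of the loop. The function name and the compile line are only examples; adjust the flags for your compiler.
// Example compile line (GCC): g++ -O3 -fopenmp -mavx dot.cpp
double dot_prod_parallel_simd(const double* v1, const double* v2, int dim)
{
    double sum = 0.;
    // Distribute iterations across threads and vectorize each thread's chunk.
    #pragma omp parallel for simd reduction(+:sum)
    for (int i = 0; i < dim; ++i)
        sum += v1[i] * v2[i];
    return sum;
}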
Upvotes: 2
Reputation: 175
I wanted to answer my own question. Looks like we can use OpenMP like the following. However, the speed gain is not that great (about 2x). My computer has 16 cores.
// need to use compile flag /openmp (MSVC); use -fopenmp with GCC/Clang
#include <cstdio>
#include <iostream>
#include <omp.h>
using std::cout;
using std::endl;

double dot_prod_parallel(double* v1, double* v2, int dim)
{
    TimeMeasureHelper helper; // timing helper, defined elsewhere
    double sum = 0.;
    #pragma omp parallel shared(sum)
    {
        int num = omp_get_num_threads();
        int id = omp_get_thread_num();
        printf("I am thread #%d of %d.\n", id, num);
        // Each thread accumulates into its own private partial sum.
        double priv_sum = 0.;
        #pragma omp for
        for (int i = 0; i < dim; i++)
        {
            priv_sum += v1[i] * v2[i];
        }
        // Merge the partial sums one thread at a time.
        #pragma omp critical
        {
            cout << "priv_sum = " << priv_sum << endl;
            sum += priv_sum;
        }
    }
    return sum;
}
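A minimal usage sketch, assuming it is compiled together with the function above and that TimeMeasureHelper is defined as in the earlier snippets; the array size and values are made up for illustration:
#include <iostream>
#include <vector>

int main()
{
    const int dim = 1 << 20; // ~1M elements, arbitrary test size
    std::vector<double> a(dim, 1.0), b(dim, 2.0);
    double result = dot_prod_parallel(a.data(), b.data(), dim);
    std::cout << "dot product = " << result << std::endl; // expect 2 * dim
    return 0;
}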
Upvotes: 0