Declare Reduction over Vector, Running on 1 thread gives different result than without OpenMP

Question

I have the following piece of code that gives the wrong answer even when I run it on 1 thread. The code gives the correct result if the two pragmas before the for-loops are commented out. How can this be? I thought that on 1 thread there would be no difference between using OpenMP and not, except possibly some minor overhead. Also, what should I do to get the "correct behaviour"? I don't have the same problem when I have just one for-loop, but with more than one it doesn't work as I would expect.

#include <algorithm>
#include <functional>
#include <iostream>
#include <vector>
using namespace std;
#pragma omp declare reduction(vec_double_plus : std::vector<double> : \
                              std::transform(omp_out.begin(), omp_out.end(), omp_in.begin(), omp_out.begin(), std::plus<double>())) \
                    initializer(omp_priv = omp_orig)

int main() {
    vector<int> v;
    vector<double> w(2);
    for (int i = 0; i < 1000; i++) {
        if (i % 2 == 0) {
            v.push_back(0);
        }
        else {
            v.push_back(1);
        }
    }
    #pragma omp parallel for reduction(vec_double_plus:w)
    for (int i = 0; i < 500; i++) {
        int r = v[i];
        w[r] += i;
    }
    #pragma omp parallel for reduction(vec_double_plus:w)
    for (int i = 500; i < 1000; i++) {
        int r = v[i];
        w[r] += i;
    }
    std::cout << w[0] << std::endl;
    std::cout << w[1] << std::endl;
}

Zulan · Accepted Answer

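The issue is that the code assumes the original variable from the outer scope is initialized with the neutral element of the reduction, i.e. that w is full of zeros. With initializer(omp_priv = omp_orig), each private copy starts as a copy of w, and at the end of the loop it is added back onto the original w, so whatever w already contained is counted again. This happens even with a single thread. The first loop is unaffected because w is still all zeros at that point; the second loop starts from a nonzero w and goes wrong.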
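To make the double counting concrete, here is a minimal serial sketch that models what a single thread does for each reduction loop: copy w into a private vector, run the loop body on that copy, then add the copy back onto w. It uses no OpenMP at all; the function names emulate_reduction_copy_init and emulate_question_program are made up for illustration, and the model is a simplification of the runtime's behaviour, not its actual implementation.

```cpp
#include <cstddef>
#include <vector>

// Hand-written model of one thread executing
//   reduction(vec_double_plus : w)  with  initializer(omp_priv = omp_orig).
// Hypothetical helper for illustration; not part of the OpenMP API.
void emulate_reduction_copy_init(std::vector<double>& w,
                                 const std::vector<int>& v,
                                 int begin, int end) {
    std::vector<double> priv = w;            // initializer: omp_priv = omp_orig
    for (int i = begin; i < end; i++) {
        priv[v[i]] += i;                     // loop body works on the private copy
    }
    for (std::size_t k = 0; k < w.size(); k++) {
        w[k] += priv[k];                     // combiner: omp_out = omp_out + omp_in
    }
}

// Runs the two loops from the question through the model above.
std::vector<double> emulate_question_program() {
    std::vector<int> v;
    for (int i = 0; i < 1000; i++) {
        v.push_back(i % 2);                  // 0 for even i, 1 for odd i
    }
    std::vector<double> w(2);                // starts as {0, 0}
    emulate_reduction_copy_init(w, v, 0, 500);     // w was zero, so still correct
    emulate_reduction_copy_init(w, v, 500, 1000);  // old w is added in twice
    return w;
}
```

Under this model the final vector is {311750, 312500}, whereas the plain serial loops give {249500, 250000}. The difference is exactly the result of the first loop, {62250, 62500}, counted one extra time.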

You can change the code to initialize omp_priv with zeroes like the following:

initializer(omp_priv = decltype(omp_orig)(omp_orig.size()))

The code looks familiar to me, so sorry about the confusion. I'll fix the original answer.
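For reference, a sketch of the question's computation with the zero-initializing initializer applied might look as follows. The loops are wrapped in a function called sum_indices_by_parity, a name chosen here for illustration, and the headers are assumed to be the ones the code needs. Compile with OpenMP enabled (for example -fopenmp with GCC); without it the pragmas are ignored and the loops simply run serially.

```cpp
#include <algorithm>
#include <functional>
#include <vector>

// Element-wise vector sum as a user-defined reduction. Each private copy
// starts as a zero-filled vector with the same size as the original.
#pragma omp declare reduction(vec_double_plus : std::vector<double> : \
        std::transform(omp_out.begin(), omp_out.end(), omp_in.begin(), \
                       omp_out.begin(), std::plus<double>())) \
    initializer(omp_priv = decltype(omp_orig)(omp_orig.size()))

// Hypothetical wrapper around the two loops from the question.
std::vector<double> sum_indices_by_parity() {
    std::vector<int> v;
    for (int i = 0; i < 1000; i++) {
        v.push_back(i % 2);                  // 0 for even i, 1 for odd i
    }
    std::vector<double> w(2);                // accumulators, start at zero

    #pragma omp parallel for reduction(vec_double_plus : w)
    for (int i = 0; i < 500; i++) {
        w[v[i]] += i;
    }
    // w is nonzero here; the zero-filled private copies keep it from
    // being counted a second time.
    #pragma omp parallel for reduction(vec_double_plus : w)
    for (int i = 500; i < 1000; i++) {
        w[v[i]] += i;
    }
    return w;
}
```

Calling sum_indices_by_parity() should return {249500, 250000} for any thread count, matching the version without pragmas: the sums are of integers that a double represents exactly, so the order of combination does not matter here.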
