Reputation: 33
I have the following piece of code that gives the wrong answer even if I run it on 1 thread. The code gives the correct result if the two pragmas before the for-loops are commented away. How can this be? I thought on 1 thread, there would be no difference between using OpenMP and not, except possibly some minor overhead. Also, what should I do to get a "correct behaviour"? I don't have the same problem when I have just one for-loop, but with more then 1, it doesn't work as I would think.
#include<iostream>
#include<vector>
#include<algorithm>
#include<omp.h>
using namespace std;
#pragma omp declare reduction(vec_double_plus : std::vector<double> : \
std::transform(omp_out.begin(), omp_out.end(), omp_in.begin(), omp_out.begin(), std::plus<double>())) \
initializer(omp_priv = omp_orig)
int main() {
vector<int> v;
vector<double> w(2);
for (int i = 0; i < 1000; i++) {
if (i % 2 == 0) {
v.push_back(0);
}
else {
v.push_back(1);
}
}
#pragma omp parallel for reduction(vec_double_plus:w)
for (int i = 0; i < 500; i++) {
int r = v[i];
w[r] += i;
}
#pragma omp parallel for reduction(vec_double_plus:w)
for (int i = 500; i < 1000; i++) {
int r = v[i];
w[r] += i;
}
std::cout << w[0] << std::endl;
std::cout << w[1] << std::endl;
}
Upvotes: 1
Views: 210
Reputation: 22670
The issue is, that the code assume that the original variable from the outside scope is initialized with the neutral element of the reduction - i.e. w
is full of zeros. It will create the local copies from this outside and add it again to the original copy. This even happens for a single thread.
You can change the code to initialize omp_priv
with zeroes like the following:
initializer(omp_priv = decltype(omp_orig)(omp_orig.size()))
The code looks familiar to me, so sorry about the confusion. I'll fix the original answer.
Upvotes: 1