Reputation: 612
Both the reduction and collapse clauses in OpenMP confuse me, and a few questions came to mind.
About collapse: can we apply collapse to nested loops that have some lines of code between them? For example:
for (int i = 0; i < 4; i++)
{
    cout << "Hi"; // This extra line sits between the two loops.
    for (int j = 0; j < 100; j++)
    {
        cout << "*";
    }
}
Upvotes: 2
Views: 1263
Reputation: 33679
The reduction clause requires that the operation is associative, and the x = a[i] - x operation in
for(int i=0; i<n; i++) x = a[i] - x;
is not associative. Try a few iterations.
n = 0: x = x0;
n = 1: x = a[0] - x0;
n = 2: x = a[1] - (a[0] - x0);
n = 3: x = a[2] - (a[1] - (a[0] - x0))
         = a[2] - a[1] + a[0] - x0;
But x = x - a[i] does work, because it is just x0 minus a sum and the sum is associative, e.g.
n = 3: x = x0 - (a[2] + a[1] + a[0]);
However, there is a workaround for x = a[i] - x. The sign alternates from one term to the next. Here is a working solution.
#include <stdio.h>
#include <omp.h>
int main(void) {
    int n = 18;
    float x0 = 3;
    float a[n];
    for(int i=0; i<n; i++) a[i] = i;
    // Serial reference result.
    float x = x0;
    for(int i=0; i<n; i++) x = a[i] - x;
    printf("%f\n", x);
    // sign is the sign of a[0] in the expanded sum; x0 has the opposite sign.
    int sign = n%2 == 0 ? -1 : 1;
    float s = -sign*x0;
    #pragma omp parallel
    {
        // Each thread sums its chunk with alternating signs, starting at +1.
        float sp = 0;
        int signp = 1;
        #pragma omp for schedule(static)
        for(int i=0; i<n; i++) sp += signp*a[i], signp *= -1;
        // Combine the partial sums in thread order, carrying the sign across chunks.
        #pragma omp for schedule(static) ordered
        for(int i=0; i<omp_get_num_threads(); i++) {
            #pragma omp ordered
            s += sign*sp, sign *= signp;
        }
    }
    printf("%f\n", s);
}
Here is a simpler version which uses the reduction clause. The thing to notice is that the odd terms all have one sign and the even terms the other. So if we do the reduction two terms at a time, the sign does not change and the operation is associative.
x = x0;
for(int i=0; i<n; i++) x = a[i] - x;
can be reduced in parallel like this.
x = n%2 ? a[0] - x0 : x0;
#pragma omp parallel for reduction (+:x)
for(int i=0; i<n/2; i++) x += a[2*i+1+n%2] - a[2*i+n%2];
Upvotes: 0
Reputation: 2858
1 & 2. For minus, what are you subtracting from? If you have two threads, do you compute result_thread_1 - result_thread_2, or result_thread_2 - result_thread_1? If you have more than two threads, it gets even more confusing: is there only one negative term while all the others are positive? Is there only one positive term while the others are negative? Is it a mix? Which results are which? As such, no, there is no workaround.
In the event of x++ or x--, assuming that they are within the reduction loop, they should be applied to each thread's partial result.
Yes, I believe so.
Upvotes: 2