Reputation: 1156
I want to write parallel code using OpenMP with a reduction to compute the sum of the squares of the values in an X*X matrix. Can I use two nested for loops after #pragma omp parallel for reduction? If not, kindly suggest an alternative.
#pragma omp parallel
{
    #pragma omp parallel for reduction(+:SqSumLocal)
    for(index=0; index<X; index++)
    {
        for(i=0; i<X; i++)
        {
            SqSumLocal = SqSumLocal + pow(InputBuffer[index][i],2);
        }
    }
}
Solution: Declaring int i inside the #pragma omp parallel region solves the problem, because i is then private to each thread instead of being shared between them.
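The fix described above can be sketched as follows. This is a minimal illustration, not the asker's final code: the function name sum_of_squares, the flat row-major array, and the use of a single combined parallel for directive (rather than one nested inside an enclosing parallel region) are assumptions made for the example. Declaring the inner index inside the loop gives each thread its own copy.

```c
#include <stddef.h>

/* Hypothetical helper: sum of squares of an n-by-n matrix stored
   row-major in a flat array. Names and layout are assumptions. */
double sum_of_squares(const double *m, int n)
{
    double sq_sum = 0.0;

    /* A single combined directive; the outer loop is split among threads
       and the per-thread partial sums are added together at the end. */
    #pragma omp parallel for reduction(+:sq_sum)
    for (int row = 0; row < n; row++) {
        /* col is declared inside the parallel loop, so it is private
           to each thread and cannot be overwritten by another one. */
        for (int col = 0; col < n; col++) {
            double v = m[(size_t)row * n + col];
            sq_sum += v * v;   /* v * v avoids a pow() call for a plain square */
        }
    }
    return sq_sum;
}
```

If the code is built without OpenMP support, the pragma is ignored and the function simply runs serially.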
Upvotes: 0
Views: 3017
Reputation: 1474
The loop nest as you've written it will work (as long as the inner index i is private to each thread), but it is not ideal: only the outer loop is parallelized, and each thread runs its inner loops serially. If X is large enough (significantly larger than the number of threads), this may be fine. If you want to parallelize both loops, add a collapse(2) clause to the directive. This tells the compiler to merge the two loops into a single iteration space and divide the whole thing among the threads.
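A minimal sketch of the collapse(2) variant, assuming the matrix is stored row-major in a flat array (the function name sum_of_squares_collapsed and that layout are illustrative, not taken from the question). collapse requires the loops to be perfectly nested, with no statements between the two for headers.

```c
#include <stddef.h>

/* Hypothetical helper: same sum of squares, but with both loops
   merged into one n*n iteration space before it is divided up. */
double sum_of_squares_collapsed(const double *m, int n)
{
    double sq_sum = 0.0;

    /* collapse(2) covers both loops; both loop indices become private. */
    #pragma omp parallel for collapse(2) reduction(+:sq_sum)
    for (int row = 0; row < n; row++) {
        for (int col = 0; col < n; col++) {
            double v = m[(size_t)row * n + col];
            sq_sum += v * v;
        }
    }
    return sq_sum;
}
```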
Consider an example where you have 8 threads and X=4. Without the collapse clause, only four threads will do work: each one will complete the work for one value of index. With the collapse clause, all 8 threads will do work, each handling half as much as one of those four threads would have. (Of course, parallelizing such a trivial amount of work is pointless; this is just an example.)
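The arithmetic in this example can be checked with a small sketch. It assumes the iterations are divided as evenly as possible among the threads; the actual default schedule is up to the OpenMP implementation, and both helper names are made up for illustration.

```c
/* Number of threads that receive at least one iteration. */
int busy_threads(int iterations, int threads)
{
    return iterations < threads ? iterations : threads;
}

/* Largest number of iterations handed to any single thread,
   assuming an even split (ceiling division). */
int max_iterations_per_thread(int iterations, int threads)
{
    return (iterations + threads - 1) / threads;
}
```

With X=4 and 8 threads, the uncollapsed outer loop has 4 iterations, so 4 threads are busy and each runs one outer iteration (4 inner iterations). The collapsed loop has 4*4 = 16 iterations, so all 8 threads are busy with 2 iterations each.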
Upvotes: 2