Reputation: 1100
So, I have roughly this code:
for (int i = 0; i != 10000; ++i) {
    doAction(i);
    for (int j = 0; j != 10000; ++j) {
        ...
    }
}
And I want to parallelize it using OpenMP. As I understand it, a simple collapse won't do in this case, and my attempts to use separate #pragma omp for directives have borne no fruit either. Is there a simple way to parallelize this, or do I have to resort to calling doAction i*j times?
Upvotes: 0
Views: 92
Reputation: 21956
The simple way to parallelize this is to use OpenMP for the outer loop only.
Parallelizing all the way down isn’t a good idea because of thread synchronization and task scheduling overhead. When you split a large CPU-bound task into pieces for parallel execution, ideally the pieces should be as large as possible while still keeping all available CPU cores busy most of the time.
P.S. If you have OpenMP 4, you might want to use #pragma omp simd instead of parallel for the inner loop. The outer loop should still be parallel. This way you'll use both kinds of parallelism at the same time: the outer loop parallelized across cores, the inner loop parallelized across SIMD lanes. In theory, that's often the fastest way to compute stuff.
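A minimal sketch of that layout, under some assumptions the question doesn't confirm: doAction(i) is safe to call concurrently for different i, and the elided inner body is a simple reduction over j. The rowScale and out buffers, the compute function, and the body of doAction are hypothetical stand-ins, not the asker's code. The loops use < rather than != because OpenMP versions before 5.0 don't accept != in the loop condition.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the question's doAction(i). It only writes to
// its own slot, so calls with different i do not race with each other.
void doAction(int i, std::vector<double>& rowScale) {
    rowScale[i] = 1.0 + i;
}

// Hypothetical workload: out[i] = sum over j in [0, m) of rowScale[i] * j.
std::vector<double> compute(int n, int m) {
    assert(n >= 0 && m >= 0);
    std::vector<double> rowScale(n), out(n);

    // Outer loop: iterations are distributed across threads.
    #pragma omp parallel for
    for (int i = 0; i < n; ++i) {
        // Runs exactly once per i, on whichever thread owns that iteration.
        doAction(i, rowScale);
        const double scale = rowScale[i];

        double sum = 0.0;
        // Inner loop: stays on the current thread, vectorized across SIMD lanes.
        #pragma omp simd reduction(+:sum)
        for (int j = 0; j < m; ++j) {
            sum += scale * j;
        }
        out[i] = sum;
    }
    return out;
}
```

With GCC or Clang this needs -fopenmp to take effect; without it the pragmas are ignored and the code runs serially with the same result. Note that doAction is still called only once per i, so there is no need to fold it into the inner loop.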
Upvotes: 1