GerardoBelic

Reputation: 25

Parallelizing for-loop inside another loop efficiently with OpenMP

I am having trouble writing the parallel directives for code that works like this:

// every iteration depends on the previous one
for (int iter = 0; iter < numIters; ++iter)
{
    #pragma omp parallel for num_threads(numThreads)
    for (int p = 0; p < numParticles; ++p)
    {
        p_velocity_calculation(...);
    }

    // implicit sync barrier

    #pragma omp parallel for num_threads(numThreads)
    for (int p = 0; p < numParticles; ++p)
    {
        p_position_calculation(...);
    }
}

The program is an n-body simulation: I first need to calculate the velocities and then the positions of a set of particles, hence the two separate for-loops.

The code runs as expected, but from what I have read, the thread pools created by the #pragma omp directives are created and destroyed on every iteration of the outer for-loop, and I don't want to waste resources recreating them.

So my question is: how can I reuse those thread pools instead of creating and destroying the threads on every iteration?

Upvotes: 0

Views: 105

Answers (1)

Victor Eijkhout

Reputation: 5794

First of all: the thread pools are not destroyed, only suspended.

Next: Have you timed this and found that creating the threads is a limiting factor in your application? If not, don't worry.

Or to put it constructively: I have timed it, and unless you have an extremely short omp parallel for that you call tens of thousands of times, the overhead is negligible.
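If you want to check this on your own machine, a rough timing sketch could look like the following. The function name averageParallelForSeconds and the sink vector are made up for illustration; the loop body is deliberately trivial so that the measurement is dominated by the cost of entering and leaving the parallel region. Treat the result as an order-of-magnitude estimate, not a benchmark.

```cpp
#include <chrono>
#include <vector>

// Hypothetical helper: runs a very short "omp parallel for" `repeats` times
// and returns the average wall-clock time per region in seconds.
// `sink` is written to so the compiler cannot drop the loop entirely.
double averageParallelForSeconds(int repeats, int numThreads, std::vector<int>& sink)
{
    const int n = static_cast<int>(sink.size());
    const auto start = std::chrono::steady_clock::now();

    for (int r = 0; r < repeats; ++r)
    {
        // A new parallel region is entered on every repetition,
        // just like in the question's outer loop.
        #pragma omp parallel for num_threads(numThreads)
        for (int i = 0; i < n; ++i)
        {
            sink[i] += 1;
        }
    }

    const std::chrono::duration<double> elapsed =
        std::chrono::steady_clock::now() - start;
    return elapsed.count() / repeats;
}
```

Compile with OpenMP enabled (for example -fopenmp with GCC or Clang) and compare the returned value with the time one of your real particle loops takes; if the region overhead is a tiny fraction of that, restructuring will not buy you much.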

But if you are really worried, put the omp parallel outside the time loop and use an omp for around each particle loop. Any code between the two for loops will then be executed redundantly by every thread, which you can either accept or wrap in an omp master if it modifies global variables.
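A minimal sketch of that restructuring, under some assumptions: the Particles struct, the simulate function, and the constant-acceleration update are placeholders standing in for your p_velocity_calculation and p_position_calculation, and the stepsDone counter is only there to show where an omp master would go.

```cpp
#include <vector>

// Hypothetical particle storage; the real simulation will have more state.
struct Particles {
    std::vector<double> pos;
    std::vector<double> vel;
};

// Runs numIters time steps with a single parallel region around the time loop.
// Returns the number of completed steps.
int simulate(Particles& ps, int numIters, int numThreads, double dt, double accel)
{
    const int numParticles = static_cast<int>(ps.pos.size());
    int stepsDone = 0; // shared; only the master thread writes it

    #pragma omp parallel num_threads(numThreads)
    {
        // Every thread executes the time loop; `iter` is private to each thread.
        for (int iter = 0; iter < numIters; ++iter)
        {
            // Work-sharing loop: iterations are split among the existing team.
            #pragma omp for
            for (int p = 0; p < numParticles; ++p)
            {
                ps.vel[p] += accel * dt; // placeholder for the velocity step
            }
            // implicit barrier at the end of "omp for"

            #pragma omp for
            for (int p = 0; p < numParticles; ++p)
            {
                ps.pos[p] += ps.vel[p] * dt; // placeholder for the position step
            }
            // implicit barrier again, so the next iteration sees all positions

            // Code here would otherwise run once per thread. "omp master"
            // restricts it to one thread and implies no barrier of its own.
            #pragma omp master
            {
                ++stepsDone;
            }
        }
    }
    return stepsDone;
}
```

The implicit barrier after each omp for preserves the ordering the original code relied on: all velocities are updated before any position is, and all positions are updated before the next time step begins.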

But really: I wouldn't worry.

Upvotes: 2
