Blue Granny

Reputation: 800

OMP 2.0 Nested For Loops

As I'm unable to use omp tasks (I'm using Visual Studio 2015, which only supports OpenMP 2.0), I'm trying to find a workaround for a nested loop. The code is as follows:

#pragma omp parallel
    {
        for (i = 0; i < largeNum; i++)
        {
#pragma omp single
        {
            //Some code to be run by a single thread
            memset(results, 0, num * sizeof(results[0]));
        }
#pragma omp for
            for (n = 0; n < num; n++) {
                //Call to my function
                largeFunc(params[n], &results[n]);
            }
        }
#pragma omp barrier
    }

I want all my threads to execute the loop largeNum times, but to wait for results to be zeroed by the memset before each pass, and then I want largeFunc to be performed across the threads. I have found no data dependencies.

I've got the omp directives all jumbled in my head at this point. Does this solution work? Is there a better way to do this without tasks?

Thanks!

Upvotes: 3

Views: 1172

Answers (2)

Why do you want all your threads to execute largeNum iterations? Do you depend on the index i inside your largeFunc in some way? If yes:

#pragma omp parallel for
    for (int i = 0; i < largeNum; i++)
    {
#pragma omp single
    {
        //Some code to be run by a single thread
        memset(results, 0, num * sizeof(results[0]));
    }
#pragma omp barrier

// #pragma omp for  -- this is not needed since it has to be coarse on the outermost level. However if the below function does not have anything to do with the outer loop then see the next example
        for (n = 0; n < num; n++) {
            //Call to my function
            largeFunc(params[n], &results[n]);
        }
    }

}

If you do not depend on i then

    for (i = 0; i < largeNum; i++)
    {
        //Some code to be run by a single thread
        memset(results, 0, num * sizeof(results[0]));

 #pragma omp parallel for
        for (int n = 0; n < num; n++) {
            //Call to my function
            largeFunc(params[n], &results[n]);
        }
    }

However, I feel you want the first one. In general you parallelise the outermost loop. Placing pragmas on the inner loop will slow your code down due to overheads if there is not enough work to be done.
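For completeness, here is a minimal compilable sketch of the second pattern (serial outer loop, parallel inner loop). The sizes and the stub largeFunc are hypothetical stand-ins for the asker's real workload; the pragma is a no-op without OpenMP enabled, so the code behaves identically when compiled serially:

```c
#include <string.h>

#define LARGE_NUM 4
#define NUM 8

/* Hypothetical stand-in for the asker's largeFunc: writes param + 1. */
static void largeFunc(int param, int *result) {
    *result = param + 1;
}

void run(const int *params, int *results) {
    /* The outer loop stays serial; only the inner loop is worth
       parallelising, since its iterations are independent. */
    for (int i = 0; i < LARGE_NUM; i++) {
        /* Single-threaded reset between passes, outside any parallel
           region, so no single/barrier directives are needed. */
        memset(results, 0, NUM * sizeof(results[0]));

#pragma omp parallel for
        for (int n = 0; n < NUM; n++) {
            largeFunc(params[n], &results[n]);
        }
    }
}
```

Note that this pays the cost of creating a parallel region largeNum times, which is exactly the overhead concern above when the inner loop is small.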

Upvotes: 1

Gilles

Reputation: 9489

What about just this code?

#pragma omp parallel private( i, n )
for ( i = 0; i < largeNum; i++ ) {
    #pragma omp for
    for ( n = 0; n < num; n++ ) {
        results[n] = 0;
        largeFunc( params[n], &results[n] );
    }
}

As far as I understand your problem, the initialisation part should be taken care of without the need of the single directive, provided the actual type of results supports assignment to 0. Moreover, your initial code was lacking the private( i ) declaration. Finally, the barrier shouldn't be needed.
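A compilable sketch of this pattern, with hypothetical sizes and a stub largeFunc standing in for the real one. Declaring i and n inside the loops makes them private automatically, which serves the same purpose as the private( i, n ) clause; the implicit barrier at the end of the worksharing for keeps the passes in order:

```c
#define LARGE_NUM 4
#define NUM 8

/* Hypothetical stand-in for largeFunc: accumulates param into result. */
static void largeFunc(int param, int *result) {
    *result += param;
}

void run(const int *params, int *results) {
#pragma omp parallel
    for (int i = 0; i < LARGE_NUM; i++) {
        /* Every thread runs the outer loop; the worksharing `for` below
           splits the inner iterations among them. Its implicit barrier
           ensures all threads finish pass i before any starts pass i+1.
           Zeroing results[n] per element replaces the single + memset. */
#pragma omp for
        for (int n = 0; n < NUM; n++) {
            results[n] = 0;
            largeFunc(params[n], &results[n]);
        }
    }
}
```

The parallel region is created once here, rather than largeNum times, which is the main advantage over putting parallel for on the inner loop.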

Upvotes: 1
