PhillipD

Reputation: 1817

Private 'for' loop for every thread in OpenMP

See edit below for my preliminary solution

Consider the following code:

#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main(void) {

int counter = 0;
int i;

omp_set_num_threads(8); 

#pragma omp parallel
        { 
            int id = omp_get_thread_num();
            #pragma omp for private(i)
            for (i = 0; i<10; i++) {
                printf("i: %d thread: %d\n", i, id);
                #pragma omp critical // or atomic
                counter++;
            }
        }

printf("counter %d\n", counter);

return 0;
}

I set the number of threads to 8. I would like each of the 8 threads to run its own complete for loop that increments the variable counter. However, it seems that OpenMP parallelizes the for loop and divides its iterations among the threads:

i: 0 thread: 0
i: 1 thread: 0
i: 4 thread: 2
i: 6 thread: 4
i: 2 thread: 1
i: 3 thread: 1
i: 7 thread: 5
i: 8 thread: 6
i: 5 thread: 3
i: 9 thread: 7
counter 10

Consequently, counter=10, but I want counter=80. What can I do so that every thread performs its own for loop while all threads increment counter?

Edit: The following code gives the desired result. I added an outer for loop that runs from 0 to the maximum number of threads. Inside this loop I can then declare my for loop variable private for each thread. Indeed, counter=80 in this case. Is this the optimal solution for this problem, or is there a better one?

int main(void) {

int counter = 0;
int i, n;

omp_set_num_threads(8); 

int mthreads = omp_get_max_threads();

#pragma omp parallel for private(i)
    for (n=0; n<mthreads; n++) {
        int id = omp_get_thread_num();
        for (i = 0; i<10; i++) {
            printf("i: %d thread: %d\n", i, id);
            #pragma omp critical
            counter++;
        }
    }

printf("counter %d\n", counter);

return 0;
}

Upvotes: 1

Views: 2910

Answers (2)

Hristo Iliev

Reputation: 74365

The solution is very simple - remove the worksharing construct for:

#pragma omp parallel
    { 
        int id = omp_get_thread_num();
        for (int i = 0; i<10; i++) {
            printf("i: %d thread: %d\n", i, id);
            #pragma omp critical // or atomic
            counter++;
        }
    }

Declaring i inside the control part of the for is part of C99 and might require that you pass the compiler an option similar to -std=c99. Otherwise you could simply declare i at the beginning of the block. Or you could declare it outside the region and make it private:

int i;

#pragma omp parallel private(i)
    { 
        int id = omp_get_thread_num();
        for (i = 0; i<10; i++) {
            printf("i: %d thread: %d\n", i, id);
            #pragma omp critical // or atomic
            counter++;
        }
    }

Since you are not using the value of counter inside the parallel region, you could also use a sum reduction instead of the critical section:

#pragma omp parallel reduction(+:counter)
    { 
        int id = omp_get_thread_num();
        for (int i = 0; i<10; i++) {
            printf("i: %d thread: %d\n", i, id);
            counter++;
        }
    }
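
For reference, here is a self-contained sketch of the reduction variant. The function name count_per_thread, the team_size output parameter, and the #ifdef _OPENMP guards are assumptions added for illustration and are not part of the code above; the guards let the same file build without OpenMP, in which case a single thread runs the loop.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical helper: every thread of the team runs the whole loop,
   and the reduction adds the per-thread counters together when the
   parallel region ends. */
int count_per_thread(int iters, int *team_size) {
    int counter = 0;
    int nthreads = 1;   /* stays 1 when built without OpenMP */

    assert(iters >= 0);

    #pragma omp parallel reduction(+:counter)
    {
#ifdef _OPENMP
        /* one thread records the actual team size (shared variable) */
        #pragma omp single
        nthreads = omp_get_num_threads();
#endif
        /* no worksharing construct: each thread executes all iterations */
        for (int i = 0; i < iters; i++) {
            counter++;  /* private copy, so no critical/atomic is needed */
        }
    }

    *team_size = nthreads;
    return counter;
}
```

Calling count_per_thread(10, &team) with a team of 8 threads returns 80, which is the value asked for in the question.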

Upvotes: 3

Jens Gustedt

Reputation: 78903

OpenMP has a concept for this: reduction. To stay with your example:

#pragma omp parallel for reduction(+:counter)
  for (unsigned n=0; n<mthreads; n++) {
    int id = omp_get_thread_num();
    for (unsigned i = 0; i<10; i++) {
      printf("i: %u thread: %d\n", i, id);
      counter++;
    }
  }

This has the advantage that no critical section is needed around the increment. OpenMP collects the total of all the different incarnations of counter by itself, and probably more efficiently.
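
Conceptually, the reduction clause gives each thread its own copy of counter (initialised to 0 for the + operator) and adds the copies into the original variable at the end. A hand-written sketch of that idea follows; the names manual_reduction and local are made up for illustration, and a real OpenMP implementation may combine the copies differently.

```c
#include <assert.h>

/* Hypothetical helper that mimics reduction(+:counter) by hand. */
int manual_reduction(int outer, int inner) {
    int counter = 0;

    assert(outer >= 0 && inner >= 0);

    #pragma omp parallel
    {
        int local = 0;  /* declared inside the region, hence private */

        /* the outer iterations are divided among the threads */
        #pragma omp for
        for (int n = 0; n < outer; n++) {
            for (int i = 0; i < inner; i++) {
                local++;    /* unsynchronised: only this thread sees local */
            }
        }

        /* one synchronised update per thread instead of one per increment */
        #pragma omp atomic
        counter += local;
    }

    return counter;
}
```

With outer set to 8 and inner set to 10, the function returns 80 regardless of how many threads take part, because the outer iterations are shared out rather than repeated.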

This can be formulated even more simply as:

#pragma omp parallel for reduction(+:counter)
  for (unsigned i=0; i<mthreads*10; i++) {
    int id = omp_get_thread_num();
    printf("i: %u thread: %d\n", i, id);
    counter++;
  }

With some compilers you probably still have to pass a flag such as -std=c99 to be allowed to declare variables within the for loop. The advantage of declaring variables as locally as possible is that you don't have to declare them private or the like. And the easiest way is certainly to have OpenMP do the split of the for loop all by itself.
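
One thing to keep in mind with the flattened loop: the total is the same, but how many iterations each thread receives depends on the schedule, so it is not guaranteed that every thread runs exactly 10. The sketch below records the per-thread share to make that visible. The names flattened_count and MAX_THREADS, the bound of 256, and the non-OpenMP fallback for omp_get_thread_num are assumptions for illustration only.

```c
#include <assert.h>
#ifdef _OPENMP
#include <omp.h>
#else
/* Fallback so the sketch also builds without OpenMP. */
static int omp_get_thread_num(void) { return 0; }
#endif

#define MAX_THREADS 256  /* assumed upper bound for this sketch */

/* Hypothetical helper: runs the flattened loop, stores how many
   iterations each thread executed, and returns the reduced total. */
int flattened_count(int total_iters, int per_thread[MAX_THREADS]) {
    int counter = 0;

    assert(total_iters >= 0);
    for (int t = 0; t < MAX_THREADS; t++) {
        per_thread[t] = 0;
    }

    #pragma omp parallel for reduction(+:counter)
    for (int i = 0; i < total_iters; i++) {
        int id = omp_get_thread_num();
        if (id < MAX_THREADS) {
            per_thread[id]++;   /* each thread writes only its own slot */
        }
        counter++;
    }

    return counter;
}
```

Calling flattened_count(80, per_thread) returns 80; the entries of per_thread then show how the runtime happened to distribute those 80 iterations.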

Upvotes: 2
