user1715122

Reputation: 967

OpenMP parallel sections within omp parallel for?

I have an outer for loop that I have parallelized using OpenMP. However, within this for loop there are sections of code that can also be executed in parallel.

Can I use OpenMP's sections construct to parallelize this? Is this even possible? Since each iteration of the for loop is run by just one thread, can I, within each iteration, ask for certain sections of code to be run by multiple threads in parallel? The rest of the code should be run by just one thread, i.e. the thread to which that loop iteration has been assigned.

For example, I have the following piece of code:

int omp_p = omp_get_max_threads();
omp_set_nested(1);  /* enable nested parallel regions */
#pragma omp parallel for num_threads(omp_p/2)
for(int p=0;p<omp_p/2;p++){
   /* Each outer thread handles the index range [a, b) */
   size_t a = (p*N)/(omp_p/2);
   size_t b = ((p+1)*N)/(omp_p/2);
   for(size_t i=a;i<b;i++){
      /*Work on A[a]->A[b]*/
      for(int j=0;j<n;j++){
         for(int k=0;k<N;k++){
            /*Serial code*/
            #pragma omp parallel sections
            {
               #pragma omp section
               {

               }
               #pragma omp section
               {

               }
            }
            /*Serial work*/
            #pragma omp parallel sections
            {
               #pragma omp section
               {

               }
               #pragma omp section
               {

               }
            }
            /*Serial code*/
         }
      }
   }
}

This causes the program to run much slower than if I hadn't used the parallel sections at all.

Upvotes: 0

Views: 3697

Answers (1)

veda

Reputation: 6594

Nested parallelism should be possible in OpenMP. However, I fear that you might not see any performance gain from it, for the following reasons:

  1. Nested OMP might create more threads than there are CPU cores, which can lead to a lot of context switching.
  2. Your OMP parallel sections are deep inside 4 nested for loops, so the threads for each inner parallel region may be created and destroyed on every iteration, which adds overhead.
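
To put rough numbers on the two points above, here is a minimal sketch in plain C. The helper names `inner_region_entries` and `peak_thread_estimate` are hypothetical, and the inner team size is an assumption: without a `num_threads` clause it depends on the implementation and on settings such as `OMP_NUM_THREADS`, so treat the result as an upper-bound estimate rather than a measurement.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: how many times the inner `parallel sections`
 * regions from the question are entered in total.
 * The i loops of all outer iterations together cover N indices, the j loop
 * runs n times, the k loop runs N times, and each (i, j, k) step enters
 * 2 parallel regions. */
unsigned long long inner_region_entries(size_t N, size_t n)
{
    return 2ULL * (unsigned long long)N * (unsigned long long)n
                * (unsigned long long)N;
}

/* Hypothetical helper: rough upper bound on simultaneously live threads
 * when every thread of the outer team opens its own inner team.
 * inner_team is an assumed value, not something OpenMP guarantees. */
int peak_thread_estimate(int outer_team, int inner_team)
{
    assert(outer_team > 0 && inner_team > 0);
    return outer_team * inner_team;
}
```

For instance, with N = 1000 and n = 10 the inner regions would be entered 20,000,000 times, so even a small per-region cost adds up. Likewise, an outer team of 4 threads that each open an inner team of 8 could give up to 32 threads competing for 8 cores.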

Upvotes: 1
