MistyD

Reputation: 17223

Understanding #pragma omp parallel

I am reading about OpenMP and it sounds amazing. I came to a point where the author states that #pragma omp parallel can be used to create a new team of threads, so I wanted to know what difference #pragma omp parallel makes here. I read that #pragma omp for uses the current team of threads to process a for loop. So I have two examples:

First simple example:

 #pragma omp for
 for(int n=0; n<10; ++n)
 {
   printf(" %d", n);
 }
 printf(".\n");

Second example:

 #pragma omp parallel
 {
  #pragma omp for
  for(int n=0; n<10; ++n) printf(" %d", n);
 }
 printf(".\n");

My question is: are those threads created on the fly every time, or once when the application starts? Also, when or why would I want to create a team of more threads?

Upvotes: 2

Views: 10341

Answers (3)

dreamcrash

Reputation: 51413

TL;DR: The only difference is that the first code has two implicit barriers whereas the second has only one.
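
To see where those barriers sit, here is the split form annotated (my sketch, not from the standard); the combined #pragma omp parallel for has effectively only the one at the end of the region:

 #pragma omp parallel
 {
     #pragma omp for
     for (int n = 0; n < 10; ++n)
         printf(" %d", n);
     // <- first implicit barrier: end of the worksharing (for) region
 } // <- second implicit barrier: end of the parallel region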


A more detailed answer, using the official OpenMP 5.1 standard as a reference.

The:

#pragma omp parallel

will create a parallel region with a team of threads, where each thread will execute the entire block of code that the parallel region encloses.

From the OpenMP 5.1 standard, one can read a more formal description:

When a thread encounters a parallel construct, a team of threads is created to execute the parallel region (..). The thread that encountered the parallel construct becomes the primary thread of the new team, with a thread number of zero for the duration of the new parallel region. All threads in the new team, including the primary thread, execute the region. Once the team is created, the number of threads in the team remains constant for the duration of that parallel region.
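
For example, in this minimal sketch (mine, not from the standard), every thread in the team executes the whole block, so the message is printed once per thread:

 #include <stdio.h>
 #include <omp.h>

 int main(void)
 {
     #pragma omp parallel
     {
         // Every thread of the team runs this entire block.
         printf("Hello from thread %d of %d\n",
                omp_get_thread_num(), omp_get_num_threads());
     }
     return 0;
 }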

The:

#pragma omp parallel for

will create a parallel region (as described before), and the iterations of the loop that it encloses will be assigned to the threads of that region, using the default chunk size and the default schedule, which is typically static. Bear in mind, however, that the default schedule may differ among concrete implementations of the OpenMP standard.

From the OpenMP 5.1 standard, you can read a more formal description:

The worksharing-loop construct specifies that the iterations of one or more associated loops will be executed in parallel by threads in the team in the context of their implicit tasks. The iterations are distributed across threads that already exist in the team that is executing the parallel region to which the worksharing-loop region binds.

Moreover,

The parallel loop construct is a shortcut for specifying a parallel construct containing a loop construct with one or more associated loops and no other statements.

Or informally, #pragma omp parallel for is a combination of the construct #pragma omp parallel with the construct #pragma omp for.
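
Applied to your second example, the two directives collapse into one (same behavior, written as a sketch):

 #pragma omp parallel for
 for(int n=0; n<10; ++n) printf(" %d", n);
 printf(".\n");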

Both versions that you have, with a chunk_size=1 and a static schedule, would result in something like this:

[Image: the loop iterations assigned round-robin to the threads of the team; with chunk_size=1 and a static schedule, thread 0 takes iterations 0, 4, 8, thread 1 takes 1, 5, 9, and so on, assuming a team of 4 threads.]

Code-wise the loop would be transformed to something logically similar to:

// Each thread starts at its own thread number and jumps ahead by the
// team size, which yields a cyclic (round-robin) distribution.
for(int i = omp_get_thread_num(); i < n; i += omp_get_num_threads())
{
    //...
}

where omp_get_thread_num() is described as:

The omp_get_thread_num routine returns the thread number, within the current team, of the calling thread.

and omp_get_num_threads() as:

Returns the number of threads in the current team. In a sequential section of the program omp_get_num_threads returns 1.

or, in other words, for(int i = THREAD_ID; i < n; i += TOTAL_THREADS), with THREAD_ID ranging from 0 to TOTAL_THREADS - 1, and TOTAL_THREADS representing the total number of threads in the team created for the parallel region.
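
Putting it all together, here is a self-contained sketch (my illustration) of the cyclic distribution that #pragma omp for performs with a static schedule and chunk_size=1, written out by hand:

 #include <stdio.h>
 #include <omp.h>

 int main(void)
 {
     const int n = 10;
     #pragma omp parallel
     {
         // Each thread starts at its own ID and strides by the team size,
         // reproducing the round-robin assignment shown above.
         for (int i = omp_get_thread_num(); i < n; i += omp_get_num_threads())
             printf("thread %d got iteration %d\n", omp_get_thread_num(), i);
     }
     return 0;
 }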

Upvotes: 4

pTz

Reputation: 144

Your first example will not run in parallel like that. #pragma omp for tells the compiler to distribute the work of the following loop among the team of threads, which you have to create first. A team of threads is created with the #pragma omp parallel directive, as in your second example. You can also combine the omp parallel and omp for directives by using #pragma omp parallel for. The team of threads is created on entering the parallel block and is valid within it.

Upvotes: 5

alexbuisson

Reputation: 8469

À "parallel" region can contains more than a simple "for" loop. At the 1st time your program meet "parallel" the open MP thread team will be create, after that, every open mp construct will reuse those thread for loop, section, task, etc.....

Upvotes: 0
