Lei Wang

Reputation: 11

Use OpenMP section parallel in a non-parallel time-dependent do loop

I have a quick question regarding OpenMP. Usually one can write parallel sections like this (in Fortran, with two sections):

!$OMP PARALLEL SECTIONS

!$OMP SECTION

< Fortran code block A>

!$OMP SECTION

< Fortran code block B>

!$OMP END PARALLEL SECTIONS

What I really want is to run Fortran code blocks A and B inside a do loop that itself must not be parallelized, because it is a time-stepping loop in which every new step depends on the previous step's results. Before the parallel sections, I also need to run some serial code (let's call it block C). Blocks A, B, and C all depend on the loop variable t. Naively, one might simply embed the parallel sections in a do loop:

do t=1,tmax

  < Fortran serial code block C>

  !$OMP PARALLEL SECTIONS

  !$OMP SECTION

  < Fortran code block A>

  !$OMP SECTION

  < Fortran code block B>

  !$OMP END PARALLEL SECTIONS

end do

However, it seems obvious to me that the overhead of creating the threads on every iteration will slow this down considerably, possibly making it even slower than plain serial code. So one might look for a smarter approach.

Could you give me some hints on how to do this? What is the best approach (fastest computation)?

Upvotes: 1

Views: 459

Answers (1)

Hristo Iliev

Reputation: 74365

I concur with both comments that it is not at all obvious how much the OpenMP overhead would be compared to the computation. If you find it (after performing the corresponding measurements) to be really high, then the typical way to handle this case is to put the loop inside a parallel region:

!$OMP PARALLEL PRIVATE(t)
do t=1,tmax

  !$OMP SINGLE
    < Fortran code block C >
  !$OMP END SINGLE

  !$OMP SECTIONS

    !$OMP SECTION
    < Fortran code block A >

    !$OMP SECTION
    < Fortran code block B >

  !$OMP END SECTIONS

end do
!$OMP END PARALLEL

Each thread will loop independently. The SECTIONS construct has an implicit barrier at its end, so the threads are synchronised before the next loop iteration. If some additional code that does not synchronise is placed after the SECTIONS construct inside the loop body, an explicit barrier has to be inserted just before end do.

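The SINGLE construct is used to isolate block C so that it gets executed by one thread only. It also has an implicit barrier at its end, so blocks A and B do not start until block C has finished.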
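
As a minimal, self-contained sketch of the pattern above, the program below replaces blocks A, B and C with made-up one-line updates. The variables a, b, c, the value of tmax, and the arithmetic are all placeholders chosen for illustration, not anything from the question. It should build with an OpenMP flag (for example -fopenmp with gfortran) and also compiles as ordinary serial code without it, since the directives are then treated as comments.

```fortran
program sections_in_time_loop
  implicit none
  integer, parameter :: tmax = 5     ! hypothetical number of time steps
  integer :: t
  real :: a, b, c                    ! hypothetical state shared by all threads

  a = 0.0
  b = 0.0
  c = 0.0

  ! One parallel region around the whole time loop: the thread team is
  ! created once, and every thread runs the loop with its own copy of t.
  !$omp parallel private(t) shared(a, b, c)
  do t = 1, tmax

    ! Stand-in for block C: executed by one thread only, uses the
    ! results of the previous step.
    !$omp single
    c = a + b + real(t)
    !$omp end single
    ! (implicit barrier here: A and B start only after C is done)

    !$omp sections
    !$omp section
    a = 2.0 * c                      ! stand-in for block A
    !$omp section
    b = c + 1.0                      ! stand-in for block B
    !$omp end sections
    ! (implicit barrier here: step t is complete before step t+1 begins)

  end do
  !$omp end parallel

  print *, 'a =', a, ' b =', b, ' c =', c
end program sections_in_time_loop
```

Whether this is actually faster than opening a PARALLEL SECTIONS region in every iteration depends on how long blocks A and B run relative to the synchronisation cost, so it is worth timing both variants on the real code.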

Upvotes: 1
