Reputation: 1
The code solves the following equation:
A1(y,bp,kp) = \sum_i ( B(y,yp_i) * C(yp_i,bp,kp) * \sum_j ( D(bpp_j,kpp_j,yp_i,bp,kp) * A0(yp_i,bpp_j,kpp_j) ) )
I have the following code with multiple do-loops. The purpose is to compute an array A1 from other arrays.
do iKp = 1, nK
   do iBp = 1, nB
      do iY = 1, nY
         tempibk = 0.0_dp
         do iYp = 1, nY
            do iBK = 1, nBK
               ibpp = bpi(iYp,iBp,iKp,iBK)
               ikpp = Kpi(iYp,iBp,iKp,iBK)
               tempibk(iYp) = tempibk(iYp) + D(iYp,iBp,iKp,iBK)*A0(iYp,ibpp,ikpp)
            end do
            A1(iY,iBp,iKp) = A1(iY,iBp,iKp) + B(iY,iYp)*C(iYp,iBp,iKp)*tempibk(iYp)
         end do
      end do
   end do
end do
I want to parallelize the code with OpenMP. My question is related to the thread safety of this piece of code.
My attempt to parallelize it is the following:
!$OMP PARALLEL DO PRIVATE(iY, iBp, iYp, iKp, iBK, ibpp, ikpp, tempibk) SHARED(q1)
do iKp = 1, nK
   do iBp = 1, nB
      do iY = 1, nY
         tempibk = 0.0_dp
         do iYp = 1, nY
            do iBK = 1, nBK
               ibpp = bpi(iYp,iBp,iKp,iBK)
               ikpp = Kpi(iYp,iBp,iKp,iBK)
               tempibk(iYp) = tempibk(iYp) + D(iYp,iBp,iKp,iBK)*A0(iYp,ibpp,ikpp)
            end do
            !$OMP ATOMIC
            A1(iY,iBp,iKp) = A1(iY,iBp,iKp) + B(iY,iYp)*C(iYp,iBp,iKp)*tempibk(iYp)
            !$OMP END ATOMIC
         end do
      end do
   end do
end do
!$OMP END PARALLEL DO
My concern was that multiple threads working on A1 would create a race condition. Two threads working on different (iY, iYp, iBp, iKp) combinations might try to access and update the same element of A1 at the same time. My solution was the use of the ATOMIC directive.
Should the code inside the iBK do-loop have a critical section?
I am also wondering if tempibk needs a REDUCTION. In fact, would A1 also use a REDUCTION?
I also don't quite understand which tempibk array the thread working on a given (iKp, iBp, iY, iYp) combination "sees".
Upvotes: -1
Views: 99
Reputation: 2688
You have already asked a very similar question. I think that you should learn a bit about OpenMP to get the basic principles. OpenMP is a relatively simple way to implement parallelism; nonetheless, some learning is required.
To answer your questions:
Two threads working on different (iY, iYp, iBp, iKp) combinations might try to access and update the same element of q1 at the same time.
I assume you meant A1 and not q1 (so the clause should rather read shared(A1), but it's OK anyway because all variables are shared unless specified otherwise)?
This cannot happen. Two different threads necessarily have different iKp values, so the element A1(iY,iBp,iKp) cannot be the same in two different threads (in other terms, no race condition can occur on A1). Consequently, the atomic directive is not needed.
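For reference, here is a minimal sketch of the loop nest with the atomic directive removed (keeping your array names and private list; the shared(q1) clause can simply be dropped, since everything not listed as private is shared by default):
!$OMP PARALLEL DO PRIVATE(iY, iBp, iYp, iKp, iBK, ibpp, ikpp, tempibk)
do iKp = 1, nK
   do iBp = 1, nB
      do iY = 1, nY
         tempibk = 0.0_dp
         do iYp = 1, nY
            do iBK = 1, nBK
               ibpp = bpi(iYp,iBp,iKp,iBK)
               ikpp = Kpi(iYp,iBp,iKp,iBK)
               tempibk(iYp) = tempibk(iYp) + D(iYp,iBp,iKp,iBK)*A0(iYp,ibpp,ikpp)
            end do
            ! safe without atomic: each thread owns distinct iKp values,
            ! so no two threads ever touch the same element of A1
            A1(iY,iBp,iKp) = A1(iY,iBp,iKp) + B(iY,iYp)*C(iYp,iBp,iKp)*tempibk(iYp)
         end do
      end do
   end do
end do
!$OMP END PARALLEL DO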
Should the code inside the iBK do-loop have a critical section?
All the variables you are updating in this loop are private, so there cannot be any race condition, which means that a critical section is not needed.
I am also wondering if tempibk needs a REDUCTION.
Generally there is no need for a reduction if the content of the variable is not needed after the end of the parallel region.
In fact, would A1 also use a REDUCTION?
Since there is no race condition on A1, no reduction is needed.
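For contrast, a reduction would only be needed if all threads accumulated into one and the same variable. A purely hypothetical example (not part of your code), summing all of A1 into a single real(dp) scalar named total:
total = 0.0_dp
!$OMP PARALLEL DO REDUCTION(+:total)
do iKp = 1, nK
   ! each thread adds into its own private copy of total;
   ! the copies are combined into the shared total at the end of the loop
   total = total + sum(A1(:,:,iKp))
end do
!$OMP END PARALLEL DO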
I also don't quite understand which tempibk array the thread working on a given (iKp, iBp, iY, iYp) combination "sees".
When a variable is private, on entering the parallel region the compiler allocates as many instances of this variable as there are threads. Each instance is attached to a given thread and is visible only from this thread.
For instance
real :: temp
integer :: i, j, m
!$OMP PARALLEL DO PRIVATE(temp) NUM_THREADS(2)
do i = 1, 8
   temp = 0.0
   do j = 1, m
      temp = temp + cos(real(i+j))
   end do
   print*, i, temp
end do
!$OMP END PARALLEL DO
Thread 0 does:
real :: temp_0
integer :: i_0, j_0
do i_0 = 1, 4
   temp_0 = 0.0
   do j_0 = 1, m
      temp_0 = temp_0 + cos(real(i_0+j_0))
   end do
   print*, i_0, temp_0
end do
And thread 1 does:
real :: temp_1
integer :: i_1, j_1
do i_1 = 5, 8
   temp_1 = 0.0
   do j_1 = 1, m
      temp_1 = temp_1 + cos(real(i_1+j_1))
   end do
   print*, i_1, temp_1
end do
The *_0 and *_1 variables are actually created when entering the parallel region, and destroyed when exiting it.
Note that the indexes of all the loops inside the parallel region are private by default, so writing private(i,j) is not required.
Upvotes: 1