Clinton Winant
Clinton Winant

Reputation: 734

OpenMP: how to protect an array from race condition

This is a follow up to question 36182486, 41421437 and several others. I want to speed up the assembly of skewness and mass matrices for a FEM calculation by using multiple processors to deal with individual elements in parallel. This little MWE shows the guts of the operation.

!! compile with gfortran -fopenmp -o FEMassembly FEMassembly.f90
Program FEMassembly
  use, intrinsic :: iso_c_binding
  implicit none
  real (c_double) :: arrayM(3,3)=reshape((/2.d0,1.d0,1.d0,1.d0,&
       &2.d0,1.d0,1.d0,1.d0,2.d0/),(/3,3/)) ! contrib from one element
  integer (c_int) :: ke,ne=4,kx,nx=6,nodes(3)
  real (c_double) :: L(6,6)
  integer (c_int) :: t(4,3)=reshape((/1,2,5,6,2,3,4,5,4,5,2,3/),(/4,3/))
  !!   first, no OMP

  do ke=1,ne ! for each triangular element
     nodes=t(ke,:)
     L(nodes,nodes)=L(nodes,nodes)+arrayM  
  end do
  print *,'L  no OMP'
  write(*,fmt="(6(1x,f3.0))")(L(kx,1:6),kx=1,nx)
  L=0

  !$omp parallel do private (nodes)
  do ke=1,ne ! for each triangular element
     nodes=t(ke,:)
!!     !$omp atomic
     L(nodes,nodes)=L(nodes,nodes)+arrayM  
!!     !$omp end atomic
  end do
  !$omp end parallel do
  print *,'L  with OMP and race'
  write(*,fmt="(6(1x,f3.0))")(L(kx,1:6),kx=1,nx)        
End Program FEMassembly

With the atomic directives commented out, the array L contains several wrong values, presumably because of the race condition I was trying to avoid with the atomic directives. The results are:

L  no OMP
2.  1.  0.  1.  0.  0.
1.  6.  1.  2.  2.  0.
0.  1.  4.  0.  2.  1.
1.  2.  0.  4.  1.  0.
0.  2.  2.  1.  6.  1.
0.  0.  1. -0.  1.  2.
L  with OMP and race
2.  1.  0.  1.  0.  0.
1.  6.  1.  2.  2.  0.
0.  1.  2.  0.  2.  1.
1.  2.  0.  4.  1.  0.
0.  2.  2.  1.  6.  1.
0.  0.  1.  0.  1.  2.

If the "atomic" directives are uncommented, the compiler return the error: Error: !$OMP ATOMIC statement must set a scalar variable of intrinsic type at (1) where (1) points to arrayM in the line L(nodes,nodes).....

What I am hoping to achieve is have the time consuming contributions from each element (here the trivial arrayM) happen in parallel, but since several threads address the same matrix element, something has to be done to have the sum occur in an orderly fashion. Can anyone suggest a way to do this?

Upvotes: 1

Views: 675

Answers (1)

Ian Bush
Ian Bush

Reputation: 7433

In Fortran the simplest way is to use a reduction. This is because OpenMP for Fortran supports reductions on arrays. Below is what I think you are trying to do, but take it with a pinch of salt because

  1. You don't provide the correct output so it's difficult to test
  2. With such a small array sometimes race conditions are difficult to find

    !! compile with gfortran -fopenmp -o FEMassembly FEMassembly.f90
    Program FEMassembly
      use, intrinsic :: iso_c_binding
       Use omp_lib, Only : omp_get_num_threads
      implicit none
      real (c_double) :: arrayM(3,3)=reshape((/2.d0,1.d0,1.d0,1.d0,&
           &2.d0,1.d0,1.d0,1.d0,2.d0/),(/3,3/)) ! contrib from one element
      integer (c_int) :: ke,ne=4,nodes(3)
      real (c_double) :: L(6,6)
      integer (c_int) :: t(4,3)=reshape((/1,2,5,6,2,3,4,5,4,5,2,3/),(/4,3/))
      ! Not declared in original program
      Integer :: nx, kx
      ! Not set in original program
      nx = Size( L, Dim = 1 )
    
      !$omp parallel default( none ) private ( ke, nodes ) shared( ne, t, L, arrayM )
      !$omp single
      Write( *, * ) 'Working on ', omp_get_num_threads(), ' threads'
      !$omp end single
      !$omp do reduction( +:L )
      do ke=1,ne ! for each triangular element
         nodes=t(ke,:)
         L(nodes,nodes)=L(nodes,nodes)+arrayM  
     end do
     !$omp end do
     !$omp end parallel
     write(*,fmt="(6(1x,f3.0))")(L(kx,1:6),kx=1,nx)        
    End Program FEMassembly
    

Upvotes: 2

Related Questions