OpenMP: how to protect an array from race condition

Question

This is a follow up to question 36182486, 41421437 and several others. I want to speed up the assembly of skewness and mass matrices for a FEM calculation by using multiple processors to deal with individual elements in parallel. This little MWE shows the guts of the operation.

!! compile with gfortran -fopenmp -o FEMassembly FEMassembly.f90
Program FEMassembly
  use, intrinsic :: iso_c_binding
  implicit none
  real (c_double) :: arrayM(3,3)=reshape((/2.d0,1.d0,1.d0,1.d0,&
       &2.d0,1.d0,1.d0,1.d0,2.d0/),(/3,3/)) ! contrib from one element
  integer (c_int) :: ke,ne=4,kx,nx=6,nodes(3)
  real (c_double) :: L(6,6)
  integer (c_int) :: t(4,3)=reshape((/1,2,5,6,2,3,4,5,4,5,2,3/),(/4,3/))
  !!   first, no OMP

  do ke=1,ne ! for each triangular element
     nodes=t(ke,:)
     L(nodes,nodes)=L(nodes,nodes)+arrayM  
  end do
  print *,'L  no OMP'
  write(*,fmt="(6(1x,f3.0))")(L(kx,1:6),kx=1,nx)
  L=0

  !$omp parallel do private (nodes)
  do ke=1,ne ! for each triangular element
     nodes=t(ke,:)
!!     !$omp atomic
     L(nodes,nodes)=L(nodes,nodes)+arrayM  
!!     !$omp end atomic
  end do
  !$omp end parallel do
  print *,'L  with OMP and race'
  write(*,fmt="(6(1x,f3.0))")(L(kx,1:6),kx=1,nx)        
End Program FEMassembly

With the atomic directives commented out, the array L contains several wrong values, presumably because of the race condition I was trying to avoid with the atomic directives. The results are:

L  no OMP
2.  1.  0.  1.  0.  0.
1.  6.  1.  2.  2.  0.
0.  1.  4.  0.  2.  1.
1.  2.  0.  4.  1.  0.
0.  2.  2.  1.  6.  1.
0.  0.  1. -0.  1.  2.
L  with OMP and race
2.  1.  0.  1.  0.  0.
1.  6.  1.  2.  2.  0.
0.  1.  2.  0.  2.  1.
1.  2.  0.  4.  1.  0.
0.  2.  2.  1.  6.  1.
0.  0.  1.  0.  1.  2.

If the "atomic" directives are uncommented, the compiler return the error: Error: !$OMP ATOMIC statement must set a scalar variable of intrinsic type at (1) where (1) points to arrayM in the line L(nodes,nodes).....

What I am hoping to achieve is have the time consuming contributions from each element (here the trivial arrayM) happen in parallel, but since several threads address the same matrix element, something has to be done to have the sum occur in an orderly fashion. Can anyone suggest a way to do this?

Ian Bush · Accepted Answer

In Fortran the simplest way is to use a reduction. This is because OpenMP for Fortran supports reductions on arrays. Below is what I think you are trying to do, but take it with a pinch of salt because

You don't provide the correct output so it's difficult to test

With such a small array sometimes race conditions are difficult to find

!! compile with gfortran -fopenmp -o FEMassembly FEMassembly.f90
Program FEMassembly
  use, intrinsic :: iso_c_binding
   Use omp_lib, Only : omp_get_num_threads
  implicit none
  real (c_double) :: arrayM(3,3)=reshape((/2.d0,1.d0,1.d0,1.d0,&
       &2.d0,1.d0,1.d0,1.d0,2.d0/),(/3,3/)) ! contrib from one element
  integer (c_int) :: ke,ne=4,nodes(3)
  real (c_double) :: L(6,6)
  integer (c_int) :: t(4,3)=reshape((/1,2,5,6,2,3,4,5,4,5,2,3/),(/4,3/))
  ! Not declared in original program
  Integer :: nx, kx
  ! Not set in original program
  nx = Size( L, Dim = 1 )

  !$omp parallel default( none ) private ( ke, nodes ) shared( ne, t, L, arrayM )
  !$omp single
  Write( *, * ) 'Working on ', omp_get_num_threads(), ' threads'
  !$omp end single
  !$omp do reduction( +:L )
  do ke=1,ne ! for each triangular element
     nodes=t(ke,:)
     L(nodes,nodes)=L(nodes,nodes)+arrayM  
 end do
 !$omp end do
 !$omp end parallel
 write(*,fmt="(6(1x,f3.0))")(L(kx,1:6),kx=1,nx)        
End Program FEMassembly

OpenMP: how to protect an array from race condition

Answers (1)

Related Questions