midnightGreen

Reputation: 130

MPI_GATHERV overwrites arrays that are not referenced in the call

I have a peculiar issue with MPI: an array that is not referenced in the MPI call is being overwritten, so there is some sort of memory issue going on.

On the first call to gatherv, MPI works as expected. On the second call to gatherv, data from the first array is affected!

The code I am working in is quite large; however, I have created a standalone program that roughly shows the issue.

In the smaller program the problem is still there, but it triggers a segfault instead of just continuing on the way the larger program does.


    program main

      use mpi

      integer :: chunksize, send_count, i_start, i_end
      integer, allocatable :: rec_starts(:), rec_counts(:)

      integer, parameter :: dp = 8; ! double precision

      REAL(DP), allocatable:: array_2d(:,:)
      REAL(DP), allocatable:: array_3d(:,:,:)

      INTEGER, parameter:: num_skill=5, num_pref=2

      INTEGER, parameter:: num_ed=3, num_children=2, num_age=4, num_market=28, num_health=2, num_year=2
      INTEGER, parameter:: num_total_state_m=num_children*num_market*num_year*num_ed*num_age*num_health*num_ed*num_age*num_health  

      real(dp), dimension(num_skill,num_total_state_m) :: array_2d_local
      real(dp), dimension(num_pref,num_pref,num_total_state_m) :: array_3d_local

      integer i,j,k,l,m

      !mpi vars
      integer :: ierr, ntasks, mpi_id



      ! Set up MPI
      call mpi_init(ierr)
      call mpi_comm_size(mpi_comm_world, ntasks, ierr) !get number of tasks
      call mpi_comm_rank(mpi_comm_world, mpi_id, ierr) !get id of each task
      write(*,*) 'process ', mpi_id+1, 'of ', ntasks, 'is alive,', ' mpi_id:',mpi_id

      !calculate which 'i' this thread is responsible for
      chunksize = (num_total_state_m + ntasks - 1) / ntasks !note int/int rounds down
      i_start = (mpi_id)*chunksize + 1
      i_end = min((mpi_id+1)*chunksize,num_total_state_m)

      !set up practice matrices
      allocate(array_2d(num_skill,num_total_state_m), &
           array_3d(num_pref,num_pref,num_total_state_m))

      l = 1
      m = -1
      do i=1,num_skill
         do j=1, num_total_state_m
            if (mpi_id==0) array_2d_local(i,j) = l
            if (mpi_id==1) array_2d_local(i,j) = m
            l = l + 1
            m = m - 1
         end do
      end do

      l = 1
      m = -1
      do i=1, num_pref
         do j=1, num_pref
            do k=1, num_total_state_m
               if (mpi_id==0) array_3d_local(i,j,k) = l
               if (mpi_id==1) array_3d_local(i,j,k) = m
               l = l + 1
               m = m - 1
            end do
         end do
      end do


      ! Next send matrices
      allocate(rec_starts(ntasks), rec_counts(ntasks))
      do i=1, ntasks
         rec_counts(i) = min(num_total_state_m, i * chunksize) - (i-1)*chunksize
         rec_starts(i) = (i-1) * chunksize
      end do
      rec_counts = rec_counts * num_skill
      rec_starts = rec_starts * num_skill
      send_count = rec_counts(mpi_id+1)


      ! -m  (dimensions:num_skill, num_total_state_m)  double
      call mpi_gatherv(array_2d_local(:,i_start:i_end), send_count, &
           mpi_double_precision, &
           array_2d, rec_counts, rec_starts, mpi_double_precision, &
           0, mpi_comm_world, ierr)

      ! Next do 3d array
      ! IF THESE LINES ARE UNCOMMENTED, THE PROGRAM WORKS FINE!
      !do i=1, ntasks
      !   rec_counts(i) = min(num_total_state_m, i * chunksize) - (i-1)*chunksize
      !   rec_starts(i) = (i-1) * chunksize
      !end do
      rec_counts = rec_counts * num_pref
      rec_starts = rec_starts * num_pref
      send_count = rec_counts(mpi_id+1)
      ! -array_3d    (num_pref,num_pref,num_total_state_m)double
      print*, array_2d(1,1), mpi_id, 'before'
      call mpi_gatherv(array_3d_local(:,:,i_start:i_end), send_count, &
           mpi_double_precision, &
           array_3d, rec_counts, rec_starts, mpi_double_precision, &
           0, mpi_comm_world, ierr)
      print*, array_2d(1,1), mpi_id, 'after'


      deallocate(rec_starts, rec_counts)
      deallocate(array_2d, array_3d)



    end program main

Output in this smaller program looks like this:

    mpifort -fcheck=all -fbacktrace -g -Og -ffree-line-length-2048  main.f90 -o run_main
    mpiexec -np 2 run_main 2>&1 | tee run_main.log
     process            1 of            2 is alive, mpi_id:           0
     process            2 of            2 is alive, mpi_id:           1
       1.0000000000000000                0 before
       0.0000000000000000                1 before

    Program received signal SIGSEGV: Segmentation fault - invalid memory reference.

    Backtrace for this error:
    #0  0x101e87579
    #1  0x101e86945
    #2  0x7fff6a9ecb5c

In the larger program, which does not segfault, the printed output looks something like this:

    1.0000000000000000                0 before
    0.0000000000000000                1 before
    -1.9018063100806379               0 after
    0.0000000000000000                1 after

I have been looking at other SO posts, such as MPI_Recv overwrites parts of memory it should not access, but, as a non-expert in Fortran/MPI, the replies to those posts are unfortunately not quite enough for me to understand the issue.

Any assistance or insight is much appreciated. Thanks!

Edit: Thanks, that was just me being an idiot. If someone else comes across this, triple-check your recvcounts and displs!
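
One cheap way to do that triple check is to have the root verify, just before each gatherv, that the counts and displacements tile the receive buffer of that call with no gaps, overlaps, or overruns. Below is a minimal sketch reusing the names from the program above (rec_counts, rec_starts, array_2d, ntasks, mpi_id, i), written for the first gather into array_2d; for the second gather, check against array_3d instead. It assumes, as in this program, that the chunks are meant to cover the whole buffer:

    ! Root-side sanity check (sketch): the counts should exactly tile the
    ! receive buffer and each chunk should start where the previous one ended.
    if (mpi_id == 0) then
       if (sum(rec_counts) /= size(array_2d)) &
            print*, 'recvcounts sum', sum(rec_counts), '/= buffer size', size(array_2d)
       do i = 1, ntasks-1
          if (rec_starts(i) + rec_counts(i) /= rec_starts(i+1)) &
               print*, 'gap or overlap in displs between ranks', i-1, 'and', i
       end do
       if (rec_starts(ntasks) + rec_counts(ntasks) > size(array_2d)) &
            print*, 'last chunk runs past the end of the receive buffer'
    end if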

Upvotes: 0

Views: 119

Answers (1)

Gilles Gouaillardet

Reputation: 8395

Your initial code does

      do i=1, ntasks
         rec_counts(i) = min(num_total_state_m, i * chunksize) - (i-1)*chunksize
         rec_starts(i) = (i-1) * chunksize
      end do
      rec_counts = rec_counts * num_skill
      rec_starts = rec_starts * num_skill
      send_count = rec_counts(mpi_id+1)

and then

      rec_counts = rec_counts * num_pref
      rec_starts = rec_starts * num_pref
      send_count = rec_counts(mpi_id+1)

You simply forgot to divide by num_skill. A trivial fix is to replace the last three lines with

      rec_counts = rec_counts * num_pref / num_skill
      rec_starts = rec_starts * num_pref / num_skill
      send_count = rec_counts(mpi_id+1)
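
Alternatively, if you would rather not keep track of what rec_counts and rec_starts were last multiplied by, you can recompute them from the state chunks before every gatherv (as the commented-out loop in the question does) and then scale by the number of elements each state column carries in the array you are about to gather. A sketch of that pattern using the names from the question, where elems_per_state is a hypothetical integer the caller sets to the product of the leading dimensions of the array being gathered, not something from the original code:

      ! Recompute counts/displacements in units of states, then convert them
      ! to element counts for the array being gathered. elems_per_state is a
      ! hypothetical variable: the product of the leading dimensions of that
      ! array.
      do i=1, ntasks
         rec_counts(i) = min(num_total_state_m, i * chunksize) - (i-1)*chunksize
         rec_starts(i) = (i-1) * chunksize
      end do
      rec_counts = rec_counts * elems_per_state
      rec_starts = rec_starts * elems_per_state
      send_count = rec_counts(mpi_id+1)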

If you ever suspect a bug in the MPI library, a good practice is to try another one (e.g. an MPICH derivative and Open MPI). If your application crashes with both, then the odds are the bug is in your app.

Upvotes: 2
