Curtis

Reputation: 1

Fortran MPI allgatherv with derived type for 2d array

Need help with this Fortran MPI problem. I am trying to gather data from different columns of a 2D array. The catch is that not all of the data in each column is used, and the number of columns assigned to each process may not be equal. All processes start with an equivalent global view of the data, each process performs work on its specific columns, and finally the processes exchange information so that they all share the common view again. The problem is similar to MPI partition and gather 2D array in Fortran and Sending 2D arrays in Fortran with MPI_Gather.

Drawn example: data(8,4) using 3 MPI processes

---------------------
| a1 | b1 | c1 | d1 |
| a2 | b2 | c2 | d2 |
| a3 | b3 | c3 | d3 |
| a4 | b4 | c4 | d4 |
| a5 | b5 | c5 | d5 |
| a6 | b6 | c6 | d6 |
| a7 | b7 | c7 | d7 |
| a8 | b8 | c8 | d8 |
---------------------

Process 1 will get 2 columns to work on, process 2 gets 1 column, and process 3 gets 1 column.

-----------  ------  ------
| a1 | b1 |  | c1 |  | d1 |
| a2 | b2 |  | c2 |  | d2 |
| a3 | b3 |  | c3 |  | d3 |
| a4 | b4 |  | c4 |  | d4 |
| a5 | b5 |  | c5 |  | d5 |
| a6 | b6 |  | c6 |  | d6 |
| a7 | b7 |  | c7 |  | d7 |
| a8 | b8 |  | c8 |  | d8 |
-----------  ------  ------

In the real problem, the actual size is data(200000,59). This is a preallocated chunk of memory where I only use part of each column (always starting at index 1). For instance, I may only need the first 3 values in each column.

-----------  ------  ------
| a1 | b1 |  | c1 |  | d1 |
| a2 | b2 |  | c2 |  | d2 |
| a3 | b3 |  | c3 |  | d3 |
| == | == |  | == |  | == |
| a4 | b4 |  | c4 |  | d4 |
| a5 | b5 |  | c5 |  | d5 |
| a6 | b6 |  | c6 |  | d6 |
| a7 | b7 |  | c7 |  | d7 |
| a8 | b8 |  | c8 |  | d8 |
-----------  ------  ------

I am trying to create send and receive datatypes that can accomplish this. My best guess so far has been to use MPI_TYPE_VECTOR: MPI_TYPE_VECTOR(COUNT, BLOCKLENGTH, STRIDE, OLDTYPE, NEWTYPE, IERROR)

For this I would use MPI_TYPE_VECTOR(1, 3, 8, MPI_DOUBLE, newtype, ierr). This should allow each process to send the minimal amount of information. With this type, I thought I should be able to exchange the data with MPI_ALLGATHERV.

MPI_ALLGATHERV(SENDBUF, SENDCOUNT, SENDTYPE, RECVBUF, RECVCOUNT, DISPLS, RECVTYPE, COMM, IERROR), where I use MPI_ALLGATHERV(data(1,my_first_col), num_cols_to_be_sent, newtype, data, RECVCOUNT[], DISPLS[], newtype, COMM, IERROR)

From what I can tell, this is the information that should be sent by each process:

Process 1: [a1,a2,a3,b1,b2,b3]
Process 2: [c1,c2,c3]
Process 3: [d1,d2,d3]

The examples I have seen all use either an entire column of data or a displacement that is naturally a multiple of the subarray needed. I cannot get the data to unpack into the correct columns. Shouldn't this be possible, since the receiving end has an understanding of the type's size/extent? Granted, I am very confused about the whole extent thing. Any help would be appreciated. The real code is at work, but here is a quick recreation for viewing and comments (it may not compile; I made it quickly).

  MODULE PARALLEL
    INTEGER iproc, nproc, rank, ierr
    INTEGER mylow, myhigh, mysize, ichunk, irem
    INTEGER, ALLOCATABLE :: isize(:), idisp(:), ilow(:), ihigh(:)
    DOUBLE PRECISION, ALLOCATABLE :: glob_val(:,:)
    INTEGER newtype
  END MODULE


  PROGRAM MAIN
  USE PARALLEL
  IMPLICIT NONE
  INCLUDE 'mpif.h'

c   **temp variables
  integer i, j
  integer num_rows,num_cols
  integer used_rows

c    ----setup MPI----
  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD,nproc,ierr)
  iproc = rank+1  !rank is base 0, rest of fortran base 1

c   ----setup initial data      
  num_rows=20  !contiguous in memory over rows (ie single column)
  num_cols=11  !noncontiguous memory between different columns

  ALLOCATE (isize(nproc))
  ALLOCATE (idisp(nproc))
  ALLOCATE (ilow(nproc))
  ALLOCATE (ihigh(nproc))      
  ALLOCATE (glob_val(num_rows,num_cols))

  glob_val = 1.0*iproc   !set all glob values to process id
  do i=1,num_cols
    do j=1,used_rows
      glob_val(j,i) = iproc+.01*j  !add reference index to used data
    end do
  end do

c   ---setup exchange information
  ichunk = num_cols/nproc
  irem = num_cols -(ichunk*nproc)
  mysize=ichunk
  if(iproc.le.irem) mysize=mysize+1

  mylow=0
  myhigh=0

  do i=1,nproc   !establish global understanding of processes
    mylow=myhigh+1
    myhigh=mylow+ichunk
    if(i.le.irem) myhigh=myhigh+1

    isize(i)=myhigh-mylow+1
    idisp(i)=(mylow-1)    !based on receiving type size/extent
    ilow(i)=mylow
    ihigh(i)=myhigh
  end do
  mylow=ilow(iproc)
  myhigh=ihigh(iproc)

  call MPI_TYPE_VECTOR(1,used_rows,num_rows,MPI_DOUBLE,
 &                     newtype,ierr)
  call MPI_TYPE_COMMIT(newtype,ierr)

c   --- perform exchange based on 'newtype'      
      !MPI_ALLGATHERV(SENDBUF, SENDCOUNT, SENDTYPE,
      !               RECVBUF, RECVCOUNT, DISPLS, RECVTYPE,
      !               COMM, IERROR)
  call MPI_ALLGATHERV(glob_val(1,mylow),mysize,newtype,
 &                    glob_val,isize,iproc,newtype,
 &                    MPI_COMM_WORLD,ierr)      

c   ---print out global results of process 2
  if(iproc.eq.2) then      
    do i=1,num_rows
      write(*,*) (glob_val(i,j),j=1,num_cols) 
    end do
  end if

  END program

Upvotes: 0

Views: 997

Answers (1)

Dr.Tower

Reputation: 995

OK, I got this working in the following way:

1) myhigh = mylow + ichunk - 1, not myhigh = mylow + ichunk

2) used_rows has to be set before the assignment loop

3) Define the actual buffers more explicitly:

call MPI_ALLGATHERV(glob_val(:,mylow:myhigh), mysize, newtype,   &
                    glob_val(1:used_rows,:), isize, idisp, newtype, &
                    MPI_COMM_WORLD, ierr)

full code using gfortran and openmpi:

  MODULE PARALLEL
    INTEGER iproc, nproc, rank, ierr
    INTEGER mylow, myhigh, mysize, ichunk, irem
    INTEGER, ALLOCATABLE :: isize(:), idisp(:), ilow(:), ihigh(:)
    DOUBLE PRECISION, ALLOCATABLE :: glob_val(:,:)
    INTEGER newtype
  END MODULE


  PROGRAM MAIN
  USE PARALLEL
  use mpi
  IMPLICIT NONE
  ! INCLUDE 'mpif.h'  !replaced by the mpi module above

!   **temp variables
  integer i, j
  integer num_rows,num_cols
  integer used_rows

!    ----setup MPI----
  call MPI_INIT(ierr)
  call MPI_COMM_RANK(MPI_COMM_WORLD,rank,ierr)
  call MPI_COMM_SIZE(MPI_COMM_WORLD,nproc,ierr)
  iproc = rank+1  !rank is base 0, rest of fortran base 1

!   ----setup initial data      
  num_rows=8  !contiguous in memory over rows (ie single column)
  num_cols=4  !noncontiguous memory between different columns
  used_rows = 3

  ALLOCATE (isize(nproc))
  ALLOCATE (idisp(nproc))
  ALLOCATE (ilow(nproc))
  ALLOCATE (ihigh(nproc))      
  ALLOCATE (glob_val(num_rows,num_cols))

!  glob_val = 1.0*iproc   !set all glob values to process id
  glob_val = -1.0 * iproc  !mark unused entries with a negative value
  do i=1,num_cols
    do j=1,used_rows
      glob_val(j,i) = (1.0*iproc)+(.01*j)  !add reference index to used data
    end do
  end do

!   ---setup exchange information
  ichunk = num_cols/nproc
  irem = num_cols -(ichunk*nproc)
  mysize=ichunk
  if(iproc.le.irem) mysize=mysize+1

  mylow=0
  myhigh=0

  do i=1,nproc   !establish global understanding of processes
    mylow=myhigh+1
    myhigh=mylow+ichunk-1
    if(i.le.irem) myhigh=myhigh+1

    isize(i)=myhigh-mylow+1
    idisp(i)=(mylow-1)    !based on receiving type size/extent
    ilow(i)=mylow
    ihigh(i)=myhigh
  end do
  mylow=ilow(iproc)
  myhigh=ihigh(iproc)

  call MPI_TYPE_VECTOR(1,used_rows,num_rows,MPI_DOUBLE_PRECISION, &
                      newtype,ierr)  !MPI_DOUBLE_PRECISION is the portable Fortran handle
  call MPI_TYPE_COMMIT(newtype,ierr)

  write(*,*) rank, idisp
  write(*,*) rank, isize
!   --- perform exchange based on 'newtype'      
      !MPI_ALLGATHERV(SENDBUF, SENDCOUNT, SENDTYPE,
      !               RECVBUF, RECVCOUNT, DISPLS, RECVTYPE,
      !               COMM, IERROR)
  call MPI_ALLGATHERV(glob_val(:,mylow:myhigh),mysize,newtype, &
                     glob_val(1:used_rows,:),isize,idisp,newtype, &
                     MPI_COMM_WORLD,ierr)      

!   ---print out global results of process 2
  if(iproc.eq.2) then      
    do i=1,num_rows
      write(*,*) (glob_val(i,j),j=1,num_cols) 
    end do
  end if

  call MPI_Finalize(ierr)

  END program
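
As a side note on the extent question: the array sections in the MPI_ALLGATHERV call above rely on compiler-generated temporaries to pack and unpack the data. A sketch of what I believe is the more idiomatic alternative (untested here, reusing the variable names from the program above): use MPI_TYPE_CREATE_RESIZED to stretch the vector type's extent to a full column of glob_val, so that the displacements count whole columns and the gather can operate in place on glob_val with no copies:

```fortran
! Sketch: resize the vector type so its extent spans one full column.
! Assumes used_rows, num_rows, isize, idisp, glob_val, ierr as in the
! program above, with idisp(i) = ilow(i)-1 counting whole columns.
  integer :: coltype, resizedtype
  integer(kind=MPI_ADDRESS_KIND) :: lb, extent

  call MPI_TYPE_VECTOR(1, used_rows, num_rows, MPI_DOUBLE_PRECISION, &
                       coltype, ierr)
  call MPI_TYPE_GET_EXTENT(MPI_DOUBLE_PRECISION, lb, extent, ierr)
  lb = 0
  extent = num_rows * extent    !one full column of glob_val, in bytes
  call MPI_TYPE_CREATE_RESIZED(coltype, lb, extent, resizedtype, ierr)
  call MPI_TYPE_COMMIT(resizedtype, ierr)

  !each process contributes its own columns of glob_val directly
  call MPI_ALLGATHERV(MPI_IN_PLACE, 0, MPI_DATATYPE_NULL, &
                      glob_val, isize, idisp, resizedtype, &
                      MPI_COMM_WORLD, ierr)
```

With the resized extent, receiving element j of process i lands at column idisp(i)+j of glob_val and fills only its first used_rows entries, which is exactly the layout in the question's drawings.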

Upvotes: 0
