bob.sacamento

Reputation: 6651

Usage of MPI_FILE_WRITE()

I am trying to understand some aspects of MPI I/O. The following test code is intended to populate the local arrays of four processes, each local array being part of a larger 10x10 array, and then write them to a file so that the entire array comes out in the proper order. You might notice that the four processes hold rectangular portions of the array which together cover the domain of the large array exactly, but whose boundaries are not "squared" with each other. This is intentional.

You'll notice that where the writing actually takes place, I have two options. The first produces a file filled with some hit-or-miss correct values, but mainly gibberish. The second option works perfectly. I was expecting the first option to work as well. What am I not understanding about mpi_file_write()?

module mpi_stuff

use mpi

integer :: err_mpi
integer :: stat_mpi(MPI_STATUS_SIZE)
integer :: numprocs, myrank
integer :: output_type
integer :: outfile
integer :: starts(2)

end module mpi_stuff


module mydata

! ll: lower left x and y of local array
! uu: upper right x and y of local array
! arrsize : dimensions of local array
integer :: ll(2), uu(2), arrsize(2)
integer, allocatable :: lcl_data(:,:)

end module mydata

program output_test

use mpi_stuff
use mydata

! init MPI.  get rank and size of comm
call mpi_init(err_mpi)
call mpi_comm_size(MPI_COMM_WORLD, numprocs, err_mpi)
call mpi_comm_rank(MPI_COMM_WORLD, myrank, err_mpi)

! initialize data
call data_init()

! define output types
print *,'proc ',myrank,' about to create'
call flush(6)
call mpi_type_create_subarray(2, (/10,10/), arrsize, starts, MPI_ORDER_FORTRAN,   &
                              MPI_INTEGER, output_type, err_mpi)
call mpi_type_commit(output_type, err_mpi)

! open file
call mpi_file_open(MPI_COMM_WORLD, 'output.mpi',  &
                   MPI_MODE_CREATE+MPI_MODE_RDWR, &
                   MPI_INFO_NULL, outfile, err_mpi)

! write to file
! option 1 -- FAILS MISERABLY!
!call mpi_file_write(outfile, lcl_data, 1, output_type, stat_mpi, err_mpi)
! option 2 -- WORKS PERFECTLY!
call mpi_file_set_view(outfile, 0, MPI_INTEGER, output_type, "native", MPI_INFO_NULL, err_mpi)
call mpi_file_write(outfile, lcl_data, arrsize(1)*arrsize(2), MPI_INTEGER, stat_mpi, err_mpi)

! clean up
call mpi_file_close(outfile, err_mpi)
call mpi_type_free(output_type, err_mpi)
call mpi_finalize(err_mpi)

end program output_test




subroutine data_init()

use mpi_stuff
use mydata

integer :: i, j, glbi, glbj, gval

select case(myrank)
  case(0)
    ll = (/1,1/)
    uu = (/4,3/)
  case(1)
    ll = (/1,4/)
    uu = (/4,10/)
  case(2)
    ll = (/5,1/)
    uu = (/10,7/)
  case(3)
    ll = (/5,8/)
    uu = (/10,10/)
end select

arrsize(1) = uu(1)-ll(1)+1
arrsize(2) = uu(2)-ll(2)+1
starts = ll - 1

print *,myrank,": ", ll, uu, starts, arrsize

allocate(lcl_data(arrsize(1), arrsize(2)))

do j = 1, arrsize(2)
  glbj = j + ll(2) - 1
  do i = 1, arrsize(1)
    glbi = i + ll(1) - 1 
    gval = (glbi-1) + 10*(glbj-1)
    lcl_data(i,j) = gval
  enddo
enddo

print *,myrank,': ',lcl_data

end subroutine data_init

Upvotes: 1

Views: 961

Answers (1)

David Henty

Reputation: 1764

I think of writing in MPI-IO as if the write call is a send operation, and you then execute a receive into the file using the filetype as the datatype on the receive side.

In the first incantation you are not telling MPI where to put the data in the file. It needs to know this because the data from each process is non-contiguous at the receive side (the file) but contiguous at the send side. By applying the subarray type at the send side you are sending random data, since indexing a 10x10 subarray type into lcl_data accesses memory outside its bounds. And since you have not specified a filetype, MPI must use some default at the receive side (the file); whatever that default is, it can't work, because you're not sending the right data.

The second incantation is 100% correct. Each process sends all its local data as a contiguous block. Now your subarray is applied at the receive side, i.e. the data from each process is unpacked into the correct section of the receive buffer (the file). The only slight worry here is that you pass a hard-coded "0" for the disp argument of set_view. This may be converted to the correct type (MPI_OFFSET_KIND) by the interface, but I have used systems where you have to pass a variable declared as INTEGER(KIND=MPI_OFFSET_KIND) to make sure you get a 64-bit zero rather than a default 32-bit value.
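A minimal sketch of that safer form, assuming everything else in the program above stays the same:

integer(kind=MPI_OFFSET_KIND) :: disp

! a displacement of the correct (64-bit) kind, rather than a bare literal 0
disp = 0_MPI_OFFSET_KIND
call mpi_file_set_view(outfile, disp, MPI_INTEGER, output_type, &
                       "native", MPI_INFO_NULL, err_mpi)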

For performance, you should use the collective routine MPI_File_write_all, which can increase write speeds by orders of magnitude for very large files and large process counts.
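As a sketch, in the second option above only the routine name needs to change; the argument list is identical to the independent write, and the view set earlier still applies:

! collective write: every rank that opened the file must make this call
call mpi_file_write_all(outfile, lcl_data, arrsize(1)*arrsize(2), &
                        MPI_INTEGER, stat_mpi, err_mpi)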

Upvotes: 3
