Seemanth M

Reputation: 17

Reading an unformatted stream binary file using MPI I/O in Fortran

I have an unformatted stream binary file of about 60 GB, which my serial code reads as follows:

   parameter(nsea=120445)
   real*4 p(nsea,nsea)
   open(10,file='my_file.grd'  &
   & ,status='old',access='stream',form='unformatted')
   read(10)((p(ibk,jbk),jbk=1,nsea),ibk=1,nsea)
   close(10)

Because reading this file takes a long time, I want to parallelize this part of the code using MPI I/O. I am trying to do this with mpi_file_set_view and mpi_file_read. Can someone guide me on how to do it efficiently? After reading and storing the array p(nsea,nsea), I want to pass the whole array on for some matrix arithmetic in the rest of the code.

Upvotes: 0

Views: 496

Answers (1)

johncampbell

Reputation: 203

I tried a simple reproducer of your sample code on my PC, which has 32 GB of memory, using a 27 GB test file. Compiled with "gfortran stream.f90 -o stream.exe", it ran in 320 seconds, so is a single thread really too slow? (Timing tests like this can be distorted by disk buffering, but that should not be a problem when file size plus memory demand exceeds installed memory.) You may wish to expand on my approach listed below by including more error tests (stat=, iostat=).

! Program to test stream I/O for large file
!
   integer*8 :: nsea = 85000 ! 40000 > 6gb ! 85000 > 27gb ! 120445 > 54 gb
   real*4, allocatable :: p(:,:)
!
   integer*8 :: ibk,jbk, nij
   integer*4 :: i,j, stat
   real*4    :: err, gb
   real*8    :: sec, start
!
! Create file
    call Elapse_Time (start)
    call Delta_Time (sec)
   allocate ( p(nsea,nsea), stat=stat )
   nij = size( p, kind=8 )
   gb  = nij ; gb = gb * 4. / (1024.**3)
   write (*,21) 'size p = ',size( p, kind=8 ), gb,' Gb : stat= ',stat
!
   forall (i=1:nsea,j=1:nsea) P(i,j) = i+j
    call Delta_Time (sec)
    write (*,22) sec,' initialised'
!
   open (unit=11, file='test_file.grd', iostat=stat, &
         status='unknown', access='stream', form='unformatted')
    call Delta_Time (sec)
    write (*,22) sec, ' open : iostat= ',stat
!
   write (11,iostat=stat) (( p(ibk,jbk), ibk=1,nsea), jbk=1,nsea)
   close (11)
   deallocate (p)
    call Delta_Time (sec)
    write (*,22) sec, ' written : iostat= ',stat
!
! Read file
   allocate ( p(nsea,nsea), stat=stat )
   write (*,21) 'size p = ',size( p, kind=8 ), gb,' Gb : stat= ',stat
!
   open (unit=12, file='test_file.grd', iostat=stat,  &
         status='old', access='stream', form='unformatted')
    call Delta_Time (sec)
    write (*,22) sec, ' open : iostat= ',stat
!
   read (12,iostat=stat) (( p(ibk,jbk), ibk=1,nsea), jbk=1,nsea)
   close (12)
    call Delta_Time (sec)
    write (*,22) sec,' read : iostat= ',stat
!
   err = 0
   do j=1,nsea
     do i=1,nsea
       err = max ( err, abs ( P(i,j)-i-j ) )
     end do
   end do
   deallocate (p)
    call Delta_Time (sec)
    write (*,23) sec, ' err= ', err
!
   call Elapse_Time (sec)
   write (*,22) sec-start, ' Completed '
!
 21 format (a,i0,' : ',f0.2,a,i0)
 22 format (f9.3,a,i0)
 23 format (f9.3,a,es10.2)
  end
!
  subroutine Elapse_Time (sec)
    real*8    sec
    integer*8 clock, rate
!
    call System_Clock ( clock, rate )
    sec = dble (clock) / dble (rate)
  end subroutine Elapse_Time
!
  subroutine Delta_Time (dt)
    real*8 :: dt, sec
    real*8 :: last = 0
!
    call Elapse_Time (sec)
    dt   = sec - last
    last = sec
  end subroutine Delta_Time
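
The extra error tests suggested above (stat=, iostat=) could look something like the following. This is a minimal sketch, not part of the timed test: the small size n, the newunit= handle lun and the message buffer msg are names chosen for illustration only.

```fortran
! Sketch only: check stat= and iostat= after each step and stop with a message.
! n, lun and msg are illustrative names, not taken from the test program above.
program error_checks_sketch
   implicit none
   integer, parameter :: n = 1000            ! small size, for illustration only
   real, allocatable  :: p(:,:)
   integer            :: stat, lun
   character(len=256) :: msg

   ! stat= is non-zero if the allocation fails; errmsg= then holds the reason
   allocate (p(n,n), stat=stat, errmsg=msg)
   if (stat /= 0) then
      write (*,*) 'allocate failed: ', trim(msg)
      stop 1
   end if

   ! status='old' makes the open fail if the file does not exist
   open (newunit=lun, file='test_file.grd', status='old', access='stream', &
         form='unformatted', action='read', iostat=stat, iomsg=msg)
   if (stat /= 0) then
      write (*,*) 'open failed: ', trim(msg)
      stop 1
   end if

   ! iostat= is positive on an error and negative if the file is too short
   read (lun, iostat=stat, iomsg=msg) p
   if (stat /= 0) then
      write (*,*) 'read failed: ', trim(msg)
      stop 1
   end if
   close (lun)
   write (*,*) 'read ok, p(1,1) = ', p(1,1)
end program error_checks_sketch
```

Reading the whole array with "read (lun) p" transfers the elements in array element order, which is the same order as the implied-do loop with ibk innermost used in the test program.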

I tested this code on 3 PCs, 2 of them with an SSD. The i5 has Win 7 and 8 GB of memory, while the i7s have Win 10 and 32 GB of memory. The following table lists the performance I obtained. The file sizes are ~85% of memory, so disk buffering should not be significant. Results can be variable. I am surprised that the SSD performance was lower than expected, while the i7 HDD performance was better than expected.

CPU       Disk    File GB  write sec  read sec  write MB/s  read MB/s
i5-2300   C:HDD       4.6       42.1      58.9       108.3       77.5
i7-4790K  E:HDD      26.9      175.0     168.7       153.8      159.6
i7-4790K  C:SSD      26.9      154.7     106.4       174.0      252.9
i7-8700K  E:HDD      26.9      141.8     132.7       189.8      202.8
i7-8700K  C:SSD      26.9      116.9      97.5       230.2      276.0

I hope these results give an indication of what serial stream I/O can achieve. I am not sure what MPI I/O rates could be expected in these circumstances.
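
For anyone who wants to measure the MPI I/O side, here is an untested sketch of one way the read might be split across ranks. It makes several assumptions that do not come from the question: the file holds exactly nsea*nsea 4-byte reals with no header, default REAL is 32 bits (so MPI_REAL matches), and a block distribution of nsea-element records over the ranks is acceptable. It uses mpi_file_read_at_all with explicit byte offsets rather than mpi_file_set_view, and the names rectype, buf, first and nloc are illustrative.

```fortran
! Untested sketch: each rank reads a contiguous block of nsea-element records.
! Assumes a headerless file of nsea*nsea 4-byte reals and 32-bit default REAL.
program mpi_read_sketch
   use mpi
   implicit none
   integer, parameter :: nsea = 120445
   integer :: ierr, astat, rank, nprocs, fh, rectype, nloc, first
   integer :: status(MPI_STATUS_SIZE)
   integer(kind=MPI_OFFSET_KIND) :: offset
   real, allocatable :: buf(:,:)

   call MPI_Init (ierr)
   call MPI_Comm_rank (MPI_COMM_WORLD, rank, ierr)
   call MPI_Comm_size (MPI_COMM_WORLD, nprocs, ierr)

   ! Block distribution: the first mod(nsea,nprocs) ranks get one extra record
   nloc  = nsea / nprocs
   first = rank * nloc + min (rank, mod (nsea, nprocs))   ! zero-based record index
   if (rank < mod (nsea, nprocs)) nloc = nloc + 1

   allocate (buf(nsea,nloc), stat=astat)
   if (astat /= 0) call MPI_Abort (MPI_COMM_WORLD, 1, ierr)

   ! One record = nsea reals, so the count passed to the read is only nloc
   ! and stays well inside the default integer range
   call MPI_Type_contiguous (nsea, MPI_REAL, rectype, ierr)
   call MPI_Type_commit (rectype, ierr)

   call MPI_File_open (MPI_COMM_WORLD, 'my_file.grd', MPI_MODE_RDONLY, &
                       MPI_INFO_NULL, fh, ierr)
   if (ierr /= MPI_SUCCESS) call MPI_Abort (MPI_COMM_WORLD, 2, ierr)

   ! With the default file view, the offset is counted in bytes
   offset = int (first, MPI_OFFSET_KIND) * int (nsea, MPI_OFFSET_KIND) &
            * 4_MPI_OFFSET_KIND
   call MPI_File_read_at_all (fh, offset, buf, nloc, rectype, status, ierr)
   if (ierr /= MPI_SUCCESS) call MPI_Abort (MPI_COMM_WORLD, 3, ierr)

   call MPI_File_close (fh, ierr)
   call MPI_Type_free (rectype, ierr)

   ! buf(:,m) now holds file record first+m (one-based); use it here

   deallocate (buf)
   call MPI_Finalize (ierr)
end program mpi_read_sketch
```

With the read order in the question (jbk innermost), file record k corresponds to p(k,:), so buf holds a slab of the transpose of p; for a file written as in my test program, record k is column k. The sketch leaves the data distributed over the ranks. Giving every rank the whole array would need about 54 GB per rank plus a further gather step, which is not shown. Some MPI implementations may also need a single large read to be split into several smaller calls.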

Upvotes: 0
