Reputation: 83
I am trying to write down a Fortran subroutine for my code in order to read a data from a file (which is a huge data set on itself).The data file contains the Location (nx0,ny0,nz0) and the field related to that location (Bx,By,Bz). (Ex: lets say the range for nx0, ny0 and nz0 is from [-15,15]. so the number of rows will be 31*31*31=29791)
-15.00000 -15.00000 -15.00000 700.00000 -590.00000 100.00000
-15.00000 -15.00000 -14.00000 -110.00000 -570.00000 100.00000
-15.00000 -15.00000 -13.00000 -550.00000 -200.00000 100.00000
-15.00000 -15.00000 -12.00000 -540.00000 -230.00000 100.00000
-15.00000 -15.00000 -11.00000 -140.00000 -50.00000 100.00000
. . . . . .
. . . . . .
. . . . . .
15.00000 15.00000 15.00000 140.00000 50.00000 100.000
What I want to do is to look for a specific location within my file (xi,yi and zi) and read the field related to that location then use it for further analysis. Not only the related field to the target position itself but also the surrounding field of that location (Like the three other side of the square around the target point).
subroutine read_data(xi,yi,zi,Bxij,Byij)
real*8,intent(in) :: xi,yi,zi !,time
real*8,intent(out) :: Bxij(4),Byij(4) !,Bzij(4)
integer,parameter :: step = 1 ,cols = 6, rows = 29791 !!!15,000,000
real,dimension(rows) :: nx0,ny0,nz0,Bx,By,Bz
character*15 filein
character*35 path_file
path_file = '/home/mehdi/Desktop/'
filein= 'test-0001'
open(7,file=trim(path_file)//filein, status='old',action='read')
xi_1 = xi +step
yi_1 = yi +step
do i = 1,rows
read(7,*) nx0(i),ny0(i),nz0(i),Bx(i),By(i),Bz(i)
c
if ( xi == nx0(i) .and. yi == ny0(i) .and.
& zi == nz0(i)) then
Bxij(1) = Bx(i)
Byij(1) = By(i)
cycle
endif
c
if ( xi == nx0(i) .and. yi_1 == ny0(i) .and.
& zi == nz0(i)) then
Bxij(2) = Bx(i)
Byij(2) = By(i)
cycle
endif
c
if ( xi_1 == nx0(i) .and. yi == ny0(i) .and.
& zi == nz0(i)) then
Bxij(3) = Bx(i)
Byij(3) = By(i)
cycle
endif
c
if ( xi_1 == nx0(i) .and. yi_1 == ny0(i) .and.
& zi == nz0(i)) then
Bxij(4) = Bx(i)
Byij(4) = By(i)
exit
endif
c
close(7)
enddo
end
I have done it this way but it is too slow. One of the most important things for me is the speed (which even for this small fraction of data set is really time consuming).
I know this slow mode is for the needs to read the whole data set each time in order to look for the target points. This subroutine is called couple times within the code and for the further steps the code is going to do the same thing over and over again, so it is time consuming.
How can I make this code work more efficiently?
Upvotes: 2
Views: 1041
Reputation: 8140
Before I begin this answer, let me reiterate what I said in the comments to your question:
Do not underestimate how much data you can put into a single array. Reading once, and then having everything in memory is still the fastest way possible.
But let's assume that the data really gets too big.
Your main issue seems to be that you have to re-read all the data from the beginning until you find the value you're looking for. That takes the time.
If you can calculate which line of the data file the value you are interested in is, it might help to convert the file into an unformatted direct access file.
Here is an example code for the conversion. It's using Fortran 2008 features, so if your compiler can't do it, you have to modify it:
program convert
use iso_fortran_env, only: real64
implicit none
integer, parameter :: reclength = 6*8 ! Six 8-byte values
integer :: ii, ios
integer :: u_in, u_out
real(kind=real64) :: pos(3), B(3)
open(newunit=u_in, file='data.txt', form='formatted', &
status='old', action='read', access='sequential')
open(newunit=u_out, file='data.bin', form='unformatted', &
status='new', action='write', access='direct', recl=reclength)
ii = 0
do
ii = ii + 1
read(u_in, *, iostat=ios) pos, B
if (ios /= 0) exit
write(u_out, rec=ii) pos, B
end do
close(u_out)
close(u_in)
end program convert
Once you have converted the data, you can read only the record you need, as long as you can calculate which one it is. I have assumed that just like in your example, the z
-coordinate changes fastest and the x
-coordinate changes slowest.
program read_txt
use iso_fortran_env, only: real64
implicit none
integer, parameter :: nx=601, ny=181, nz=61
real(kind=real64), parameter :: x_min=real(-nx/2, kind=real64)
real(kind=real64), parameter :: y_min=real(-ny/2, kind=real64)
real(kind=real64), parameter :: z_min=real(-nz/2, kind=real64)
real(kind=real64), parameter :: x_step = 1.0_real64
real(kind=real64), parameter :: y_step = 1.0_real64
real(kind=real64), parameter :: z_step = 1.0_real64
real(kind=real64) :: request(3), pos(3), B(3)
integer :: ios, u_in
integer :: ii, jj, kk, record
integer, parameter :: reclength = 6 * 8 ! Six 8-byte values
open(newunit=u_in, file='data.bin', access='direct', form='unformatted', &
status='old', action='read', recl=reclength)
mainloop : do
read(*, *, iostat=ios) request
if (ios /= 0) exit mainloop
write(*, '(A, 3F7.2)') 'searching for ', request
! Calculate record
ii = nint((request(1)-x_min)/x_step)
jj = nint((request(2)-y_min)/y_step)
kk = nint((request(3)-z_min)/z_step)
record = kk + jj * nz + ii * nz * ny + 1
read(u_in, rec=record, iostat=ios) pos, B
if (ios /= 0) then
print *, 'failure to read'
cycle mainloop
end if
write(*, '(2(A, 3F7.2))') "found pos: ", pos, " Bx, By, Bz: ", B
end do mainloop
close(u_in)
end program read_txt
Note that the unformatted is not compiler- and system independent. A file created on one computer or with a program compiled by one compiler might not be able to be read with another program or on another computer.
But if you have control over it, it might be a useful way to speed things up.
PS: I left the x, y, and z coordinates in the file so that you can check whether the values are actually what you wanted. Always good to verify these things.
Upvotes: 3