Reputation: 877
I am reading a file containing single-precision data with 512**3 data points. Based on a threshold, I assign each point a flag of 1 or 0. I wrote two programs that do the same thing, one in Fortran and one in Python. The Fortran one takes about 0.1 s, while the Python one takes minutes. Is this normal? Or could you please point out the problem with my Python program?
fortran.f
program vorticity_tracking
implicit none
integer, parameter :: length = 512**3
real, parameter :: threshold = 1320.0
character(255) :: filen
real, dimension(length) :: stored_data
integer, dimension(length) :: flag
integer index
filen = "vor.dat"
print *, "Reading the file ", trim(filen)
open(10, file=trim(filen),form="unformatted",
& access="direct", recl = length*4)
read (10, rec=1) stored_data
close(10)
do index = 1, length
if (stored_data(index).ge.threshold) then
flag(index) = 1
else
flag(index) = 0
end if
end do
stop
end program
Python file:
#!/usr/bin/env python
import struct
import numpy as np
f_type = 'float32'
length = 512**3
threshold = 1320.0
file = 'vor_00000_455.float'
f = open(file,'rb')
data = np.fromfile(f, dtype=f_type, count=-1)
f.close()
flag = []
for index in range(length):
    if (data[index] >= threshold):
        flag.append(1)
    else:
        flag.append(0)
********* Edit ******
Thanks for your comments. I am not sure how to do this in Python the way the Fortran version does it. I tried the following, but it is just as slow.
flag = np.empty(length, dtype=bool)
for index in range(length):
    if (data[index] >= threshold):
        flag[index] = 1
    else:
        flag[index] = 0
Can anyone please show me?
Upvotes: 0
Views: 1796
Reputation: 182789
Your two programs are totally different. Your Python code repeatedly changes the size of a structure. Your Fortran code does not. You're not comparing two languages, you're comparing two algorithms and one of them is obviously inferior.
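One way to check where the time actually goes is to time both loop variants on a smaller array. This is only a rough sketch: the function names `flags_append` and `flags_preallocated` and the synthetic `data` array are made up for illustration and do not come from the question.

```python
import timeit
import numpy as np

def flags_append(data, threshold):
    # Grows a Python list one element at a time, as in the question.
    flag = []
    for value in data:
        flag.append(1 if value >= threshold else 0)
    return flag

def flags_preallocated(data, threshold):
    # Allocates the result once, then fills it in a Python loop.
    flag = np.zeros(data.size, dtype=np.int8)
    for index in range(data.size):
        if data[index] >= threshold:
            flag[index] = 1
    return flag

# Assumption: synthetic data, much smaller than 512**3 points.
data = np.linspace(0.0, 2640.0, 100_000, dtype=np.float32)
threshold = 1320.0

for fn in (flags_append, flags_preallocated):
    seconds = timeit.timeit(lambda: fn(data, threshold), number=3)
    print(fn.__name__, round(seconds, 3))
```

The timings depend on the machine and the NumPy version, so measure them locally rather than relying on any quoted figure.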
Upvotes: 6
Reputation: 350
In general, Python is an interpreted language while Fortran is a compiled one, so there is some overhead in Python. But it shouldn't take that long.
One thing that can be improved in the Python version is to replace the for loop with a boolean-index operation.
# create flag filled with zeros, with the same shape as data
flag = np.zeros(data.shape)
# get a bool array stating where data >= threshold
barray = data >= threshold
# wherever barray is True, put a 1 in flag
flag[barray] = 1
Shorter version:
# create flag filled with zeros, with the same shape as data
flag = np.zeros(data.shape)
# combine the two operations without naming the temporary barray
flag[data >= threshold] = 1
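A self-contained sketch of the same idea, which checks that the vectorized version gives the same result as the explicit loop. The random `data` array, its small size, and the `int8` dtype are assumptions standing in for the real file contents.

```python
import numpy as np

# Assumption: synthetic float32 data stands in for the contents of the file.
rng = np.random.default_rng(0)
data = rng.uniform(0.0, 2640.0, size=1000).astype(np.float32)
threshold = 1320.0

# Explicit Python loop, as in the question (slow for 512**3 points).
flag_loop = np.zeros(data.shape, dtype=np.int8)
for index in range(data.size):
    if data[index] >= threshold:
        flag_loop[index] = 1

# Vectorized version: the comparison runs inside compiled NumPy code.
flag_vec = np.zeros(data.shape, dtype=np.int8)
flag_vec[data >= threshold] = 1

print(np.array_equal(flag_loop, flag_vec))  # prints True
```

Passing an integer dtype to `np.zeros` is optional; without it the flags are stored as 64-bit floats, which takes more memory for 512**3 points.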
Upvotes: 3
Reputation: 1152
Try this for Python:
flag = data >= threshold
It will give you a boolean array of flags, as you want.
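If 1/0 integers are needed rather than True/False, the boolean result can be converted afterwards. A minimal sketch, assuming a tiny hand-written `data` array in place of the file and `int8` as the target type:

```python
import numpy as np

# Assumption: four hand-picked values instead of the real data.
data = np.array([100.0, 1320.0, 2000.0, 1319.5], dtype=np.float32)
threshold = 1320.0

mask = data >= threshold      # boolean array, one entry per data point
flag = mask.astype(np.int8)   # True -> 1, False -> 0

print(mask.tolist())  # [False, True, True, False]
print(flag.tolist())  # [0, 1, 1, 0]
```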
Upvotes: 1