Reputation: 235
I need to translate MATLAB's fread into Python, in particular reading into a 2d array and skipping data while reading. I came up with the following, but I suspect there are more efficient and more 'pythonic' ways to do it (I am by no means a programmer). Any suggestions? Note that I can't read the whole file and then subsample the array, because the files to be read are too large.
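import numpy as np

def FromFileSkip(fid, count=1, skip=0, dtype=np.float32):
    # skip is the number of bytes to jump over after each read
    if skip < 0:
        raise ValueError('skip must be non-negative')
    if np.ndim(count) == 0:
        # 1d case: read one element at a time, skipping after each element
        data = np.zeros(count, dtype=dtype)
        k = 0
        while k < count:
            data[k] = np.fromfile(fid, count=1, dtype=dtype)[0]
            fid.seek(skip, 1)
            k += 1
        return data
    elif np.ndim(count) == 1:
        # 2d case: read one column of count[0] elements, then skip
        data = np.zeros(count, dtype=dtype)
        k = 0
        while k < count[1]:
            data[:, k] = np.fromfile(fid, count=count[0], dtype=dtype)
            fid.seek(skip, 1)
            k += 1
        return data
    else:
        raise ValueError('File can be read only into 1d or 2d arrays')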
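For anyone who wants to try the 1d case, here is a small sketch that builds a test file and reads it back. It assumes the layout is one float32 value followed by `skip` bytes of unrelated data. The helper names `write_test_file` and `read_with_skip` and the file name are made up for illustration.

```python
import os
import tempfile

import numpy as np


def write_test_file(path, values, skip):
    """Write each float32 value followed by `skip` filler bytes (assumed layout)."""
    with open(path, "wb") as f:
        for v in values:
            f.write(np.float32(v).tobytes())
            f.write(b"\xff" * skip)  # filler that the reader should jump over


def read_with_skip(path, count, skip, dtype=np.float32):
    """Read `count` values, seeking `skip` bytes forward after each one."""
    data = np.zeros(count, dtype=dtype)
    with open(path, "rb") as fid:
        for k in range(count):
            data[k] = np.fromfile(fid, count=1, dtype=dtype)[0]
            fid.seek(skip, 1)  # 1 means "relative to the current position"
    return data


path = os.path.join(tempfile.mkdtemp(), "skip_demo.bin")
expected = np.arange(5, dtype=np.float32)
write_test_file(path, expected, skip=12)
result = read_with_skip(path, count=5, skip=12)
print(result)
```

Comparing `result` with `expected` is a quick check that the skip is applied in the right place.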
Upvotes: 0
Views: 724
Reputation: 5362
This is more or less what you have, just a little bit cleaner maybe.
import numpy as np

def fromfileskip(fid, shape, counts, skip, dtype):
    """
    fid    : file object, should be an open binary file.
    shape  : tuple of ints, the desired shape of each data block.
             For a 2d array with xdim,ydim = 3000,2000 and xdim = fastest
             dimension, shape = (2000,3000).
    counts : int, number of times to read a data block.
    skip   : int, number of bytes to skip between reads.
    dtype  : np.dtype object, type of each binary element.
    """
    # Allocate the output with the requested element type
    data = np.zeros((counts,) + shape, dtype=dtype)
    for c in range(counts):
        # Read one block of prod(shape) elements using the dtype argument
        block = np.fromfile(fid, dtype=dtype, count=np.prod(shape))
        data[c] = block.reshape(shape)
        # Jump over `skip` bytes, relative to the current position
        fid.seek(skip, 1)
    return data
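A different technique that avoids the Python loop entirely is to describe one block plus its gap as a structured dtype and map the file with `np.memmap`. This is only a sketch, not part of the function above. It assumes every block, including the last one, is followed by `skip` bytes in the file, and that the data is in native byte order. The names `memmap_skip`, `block` and `gap` are made up for illustration.

```python
import os
import tempfile

import numpy as np


def memmap_skip(path, shape, counts, skip, dtype=np.float32, offset=0):
    """Return `counts` blocks of `shape`, each followed by `skip` ignored bytes."""
    fields = [("block", dtype, shape)]       # the data we want
    if skip > 0:
        fields.append(("gap", f"V{skip}"))   # raw bytes to ignore after each block
    record = np.dtype(fields)
    # Assumption: the file holds at least offset + counts * record.itemsize bytes.
    mm = np.memmap(path, dtype=record, mode="r", offset=offset, shape=(counts,))
    # Copy only the wanted field into an ordinary in-memory array.
    return np.array(mm["block"])


# Build a small test file: 4 blocks of shape (2, 3), each followed by 8 filler bytes.
path = os.path.join(tempfile.mkdtemp(), "blocks.bin")
blocks = np.arange(4 * 2 * 3, dtype=np.float32).reshape(4, 2, 3)
with open(path, "wb") as f:
    for b in blocks:
        f.write(b.tobytes())
        f.write(b"\x00" * 8)

data = memmap_skip(path, shape=(2, 3), counts=4, skip=8)
print(data.shape)
```

The result array only holds the wanted blocks, so the skipped bytes never take up space in it. Whether this is faster than the loop depends on how large the gaps are relative to the blocks, so it is worth timing on a real file.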
Upvotes: 1