Reputation: 739
There are different blocks in a binary file that I want to read using a single call to numpy.fromfile. Each block has the following format:
OES=[
('EKEY','i4',1),
('FD1','f4',1),
('EX1','f4',1),
('EY1','f4',1),
('EXY1','f4',1),
('EA1','f4',1),
('EMJRP1','f4',1),
('EMNRP1','f4',1),
('EMAX1','f4',1),
('FD2','f4',1),
('EX2','f4',1),
('EY2','f4',1),
('EXY2','f4',1),
('EA2','f4',1),
('EMJRP2','f4',1),
('EMNRP2','f4',1),
('EMAX2','f4',1)]
Here is the format of the binary:
Data I want (OES format repeating n times)
------------------------
Useless Data
------------------------
Data I want (OES format repeating m times)
------------------------
etc..
I know the byte increment between the data I want and the useless data. I also know the size of each data block I want.
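As a rough illustration of how those known quantities translate into file offsets, here is a minimal sketch. The helper name block_offsets, the record counts, and the gap sizes are hypothetical; the dtype mirrors the OES field list but omits the trailing shape of 1, and it assumes the records are packed with no padding.

```python
import numpy as np

# Same 17 fields as the OES layout: one int32 key followed by 16 float32 values.
suffixes = ("FD", "EX", "EY", "EXY", "EA", "EMJRP", "EMNRP", "EMAX")
OES = np.dtype([("EKEY", "i4")] + [(f"{s}{k}", "f4") for k in (1, 2) for s in suffixes])


def block_offsets(first_offset, numrecords, gap_bytes, itemsize):
    """Return the starting byte offset of each wanted block (hypothetical helper)."""
    offsets = []
    pos = first_offset
    for n, gap in zip(numrecords, gap_bytes):
        offsets.append(pos)
        # Skip over this block's records, then over the unwanted bytes that follow it.
        pos += n * itemsize + gap
    return offsets


# Made-up example: blocks of 10, 4 and 7 records, followed by 100, 32 and 0 junk bytes.
offsets = block_offsets(0, [10, 4, 7], [100, 32, 0], OES.itemsize)
```

Each record is 4 + 16 * 4 = 68 bytes, so a block of n records occupies 68 * n bytes before the next gap starts.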
So far, I have accomplished my goal by seeking on the file object f and then calling:
nparr = np.fromfile(f,dtype=OES,count=size)
So I have a different nparr for each data block I want, and I concatenate all the numpy arrays into one new array.
My goal is to have a single array with all the blocks I want without concatenating (for memory purposes). That is, I want to call nparr = np.fromfile(f,dtype=OES) only once. Is there a way to accomplish this goal?
Upvotes: 0
Views: 722
Reputation: 114781
That is, I want to call nparr = np.fromfile(f,dtype=OES) only once. Is there a way to accomplish this goal?
No, not with a single call to fromfile().
But if you know the complete layout of the file in advance, you can preallocate the array, and then use fromfile and seek to read the OES blocks directly into the preallocated array. Suppose, for example, that you know the file position of each OES block, and you know the number of records in each block. That is, you know:
file_positions = [position1, position2, ...]
numrecords = [n1, n2, ...]
Then you could do something like this (assuming f is the already opened file):
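total = sum(numrecords)
# Allocate the full result once; each block is written into its own slice.
nparr = np.empty(total, dtype=OES)
current_index = 0
for pos, n in zip(file_positions, numrecords):
    f.seek(pos)
    # fromfile builds a temporary array of n records, which is then copied into the slice.
    nparr[current_index:current_index+n] = np.fromfile(f, count=n, dtype=OES)
    current_index += n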
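If even the per-block temporary array is a concern, one possible variation is to swap np.fromfile for the file object's readinto method, which fills a slice of the preallocated array in place. The following is only a sketch under stated assumptions: read_blocks and demo are hypothetical names, REC is a shortened stand-in for the OES dtype, and the file is assumed to be opened in binary mode with the builtin open.

```python
import os
import tempfile

import numpy as np

# Shortened stand-in for the OES record layout (hypothetical, for illustration).
REC = np.dtype([("EKEY", "<i4"), ("FD1", "<f4"), ("EX1", "<f4")])


def read_blocks(f, file_positions, numrecords, dtype):
    """Read several record blocks from an open binary file into one array."""
    dtype = np.dtype(dtype)
    out = np.empty(sum(numrecords), dtype=dtype)
    start = 0
    for pos, n in zip(file_positions, numrecords):
        f.seek(pos)
        # View the target slice as raw bytes and let the file fill it in place.
        target = out[start:start + n].view(np.uint8)
        nread = f.readinto(target)
        if nread != n * dtype.itemsize:
            raise EOFError(f"short read at offset {pos}")
        start += n
    return out


def demo():
    wanted_a = np.zeros(3, dtype=REC)
    wanted_a["EKEY"] = [1, 2, 3]
    wanted_b = np.zeros(2, dtype=REC)
    wanted_b["EKEY"] = [4, 5]
    junk = b"\xff" * 10  # stands in for the "useless data" between blocks

    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(wanted_a.tobytes() + junk + wanted_b.tobytes())
        path = tmp.name
    try:
        # Offsets follow from the known block size and gap size.
        positions = [0, wanted_a.nbytes + len(junk)]
        with open(path, "rb") as f:
            return read_blocks(f, positions, [3, 2], REC)
    finally:
        os.remove(path)


result = demo()
```

The loop has the same shape as the fromfile version above; the only difference is that the bytes land directly in the final array, so peak memory stays close to the size of the result.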
Upvotes: 2