Using numpy fromfile on binary file returns 1 dimension ndarray

Question

I'm using numpy's fromfile function to read data from a binary file. The file contains a sequence of values (3 * float32, 3 * int8, 3 * float32) which I want to extract into a numpy ndarray with (rows, 9) shape.

import numpy as np

with open('file/path', 'rb') as my_file:
    # Each record: 3 big-endian float32, 3 int8, 3 big-endian float32
    my_dtype = np.dtype('>f4, >f4, >f4, >i1, >i1, >i1, >f4, >f4, >f4')
    my_array = np.fromfile(my_file, dtype=my_dtype)

    print(my_array.shape)
    print(type(my_array[0]))
    print(my_array[0])

And this returns:

(38475732,)
<class 'numpy.void'>
(-775.0602416992188, -71.0, -242.5240020751953, 39, 39, 39, 5.0, 2753.0, 15328.0)

How can I get a 2-dimensional ndarray with shape (38475732, 9)?
Why is the returned tuple of type 'numpy.void'?

Redefining question:

If all the values that I want to read from the file were, for example, 4-byte floats, I would use np.dtype('9>f4') and get what I need. But since my binary file contains different types, is there a way of casting all the values into 32-bit floats?

PS: I can do this by using 'struct' to parse the binary file into a list and then converting that list into an ndarray, but this method is much slower than using np.fromfile.

Solution:

Thanks, hpaulj, for your answer! What I did in my code was to add the following line, which converts the structured array returned by numpy's fromfile function into the expected ndarray:

my_array = my_array.astype('f4, f4, f4, f4, f4, f4, f4, f4, f4').view(dtype='f4').reshape(my_array.shape[0], 9)

This returns a (38475732, 9) ndarray.
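For anyone who wants to try the conversion without the original data, here is a minimal sketch of the whole round trip. The file name and the second record are made up for illustration; the first record reuses the values printed above. It writes two records to a temporary file, reads them back with np.fromfile, and applies the same astype/view/reshape chain.

```python
import os
import tempfile

import numpy as np

# Record layout from the question: packed, 27 bytes per record.
my_dtype = np.dtype('>f4, >f4, >f4, >i1, >i1, >i1, >f4, >f4, >f4')

# First record: values printed in the question. Second record: made up.
sample = np.array(
    [(-775.0602416992188, -71.0, -242.5240020751953, 39, 39, 39, 5.0, 2753.0, 15328.0),
     (1.0, 2.0, 3.0, 4, 5, 6, 7.0, 8.0, 9.0)],
    dtype=my_dtype,
)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'sample.bin')  # hypothetical file name
    sample.tofile(path)
    with open(path, 'rb') as my_file:
        my_array = np.fromfile(my_file, dtype=my_dtype)

# Nine native float32 fields, so every record becomes 9 contiguous floats.
float_dtype = np.dtype(', '.join(['f4'] * 9))
as_2d = my_array.astype(float_dtype).view('f4').reshape(my_array.shape[0], 9)

print(as_2d.shape)
print(as_2d.dtype)
```

The astype step matters: viewing the original 27-byte records directly as 'f4' would not work, because the int8 fields have to be converted to floats first.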

Cheers!

hpaulj · Accepted Answer

What is my_array[[0]]? my_array is a 1d array of records defined by my_dtype.

my_array[0] is one of those records, displayed as a tuple. Notice that some entries are floats and some are integers. If it were a row of a 2d array, all entries would have the same type (e.g. float).
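A small sketch of what that means in practice, using the dtype from the question on two zero-filled records (the records are placeholders, not real data):

```python
import numpy as np

# Same layout as in the question: 3 floats, 3 int8, 3 floats.
my_dtype = np.dtype('>f4, >f4, >f4, >i1, >i1, >i1, >f4, >f4, >f4')

# Two placeholder records, only for illustration.
records = np.zeros(2, dtype=my_dtype)

print(records.shape)        # one axis: one element per record
print(records.dtype.names)  # auto-generated field names f0 ... f8
print(my_dtype.itemsize)    # 6 * 4 bytes + 3 * 1 byte per record
print(type(records[0]))     # a single record is a numpy.void scalar
```

The nine values live inside the dtype as fields, not along a second array axis, which is why the shape has only one dimension.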

To convert it to a 2d array of floats, you might try:

np.array(my_array.tolist())
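As a rough illustration on two made-up records (the values are arbitrary), passing dtype=np.float32 is an addition here to get the 32-bit floats the question asks for:

```python
import numpy as np

my_dtype = np.dtype('>f4, >f4, >f4, >i1, >i1, >i1, >f4, >f4, >f4')

# Two made-up records, only for illustration.
records = np.array(
    [(1.5, 2.5, 3.5, 1, 2, 3, 4.5, 5.5, 6.5),
     (7.5, 8.5, 9.5, 4, 5, 6, 10.5, 11.5, 12.5)],
    dtype=my_dtype,
)

# tolist() gives a list of plain Python tuples; np.array then builds
# a regular 2d array from them, converting the ints to floats.
as_2d = np.array(records.tolist(), dtype=np.float32)

print(as_2d.shape)  # (2, 9)
print(as_2d.dtype)  # float32
```

This goes through Python objects for every value, so on tens of millions of records it is likely to be slow and memory hungry.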

Another way is to convert all the fields to the same type and then reshape the result. Something along these lines (tested on a different recarray):

x = array([(1.0, 2), (3.0, 4)], dtype=[('x', '
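A minimal sketch of that same-type conversion on a two-field array. The field names and types ('x' as float64, 'y' as int64) are assumptions picked for illustration, not necessarily the ones used in the original test. The last line shows numpy.lib.recfunctions.structured_to_unstructured as an alternative, assuming NumPy 1.16 or newer.

```python
import numpy as np
from numpy.lib import recfunctions as rfn

# Field names and types are assumptions chosen for illustration.
x = np.array([(1.0, 2), (3.0, 4)], dtype=[('x', 'f8'), ('y', 'i8')])

# Step 1: cast every field to float32, keeping the field names.
same_type = x.astype([('x', 'f4'), ('y', 'f4')])

# Step 2: reinterpret the records as plain float32 values and
# reshape to (number of records, number of fields).
flat = same_type.view('f4').reshape(x.shape[0], 2)
print(flat)

# Alternative: a library helper that does the cast and reshape in one call.
alt = rfn.structured_to_unstructured(x, dtype=np.float32)
print(alt)
```

The view trick only works once all fields share one type, since view reinterprets the raw bytes without converting anything.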



See also: How to convert numpy.recarray to numpy.array?
