illegal-immigrant

Reputation: 8254

How is data stored in *.npy files?

I'm saving NumPy arrays with the numpy.save function. I want other developers to be able to read the data from those files using C, so I need to know how NumPy organizes the binary data in the file. It's obvious when I'm saving an array of 'i4', but what about an array of arrays that contains some structures? I can't find any information in the documentation.

UPD: let's say the data is something like:

dt = np.dtype([('outer','(3,)<i4'),('outer2',[('inner','(10,)<i4'),('inner2','f8')])])

UPD2: What about saving "dynamic" data (dtype object)?

import numpy as np
a = [0,0,0]
b = [0,0]
c = [a,b]
dtype = np.dtype([('Name', '|S2'), ('objValue', object)])
data = np.zeros(3, dtype)
data[0]['objValue'] = a
data[1]['objValue'] = b
data[2]['objValue'] = c
data[0]['Name'] = 'a'
data[1]['Name'] = 'b'
data[2]['Name'] = 'c'

np.save(r'D:\in.npy', data)

Is it possible to read that from C?

Upvotes: 34

Views: 60891

Answers (2)

kennytm

Reputation: 523764

The npy file format is documented in numpy's NEP 1 — A Simple File Format for NumPy Arrays.

For instance, the code

>>> dt=numpy.dtype([('outer','(3,)<i4'),
...                 ('outer2',[('inner','(10,)<i4'),('inner2','f8')])])
>>> a=numpy.array([((1,2,3),((10,11,12,13,14,15,16,17,18,19),3.14)),
...                ((4,5,6),((-1,-2,-3,-4,-5,-6,-7,-8,-9,-20),6.28))],dt)
>>> numpy.save('1.npy', a)

results in the file:

93 4E 55 4D 50 59                      magic ("\x93NUMPY")
01                                     major version (1)
00                                     minor version (0)

96 00                                  HEADER_LEN (0x0096 = 150)
7B 27 64 65 73 63 72 27 
3A 20 5B 28 27 6F 75 74 
65 72 27 2C 20 27 3C 69 
34 27 2C 20 28 33 2C 29 
29 2C 20 28 27 6F 75 74 
65 72 32 27 2C 20 5B 28 
27 69 6E 6E 65 72 27 2C 
20 27 3C 69 34 27 2C 20 
28 31 30 2C 29 29 2C 20 
28 27 69 6E 6E 65 72 32                Header, describing the data structure
27 2C 20 27 3C 66 38 27                "{'descr': [('outer', '<i4', (3,)),
29 5D 29 5D 2C 20 27 66                            ('outer2', [
6F 72 74 72 61 6E 5F 6F                               ('inner', '<i4', (10,)), 
72 64 65 72 27 3A 20 46                               ('inner2', '<f8')]
61 6C 73 65 2C 20 27 73                            )],
68 61 70 65 27 3A 20 28                  'fortran_order': False,
32 2C 29 2C 20 7D 20 20                  'shape': (2,), }"
20 20 20 20 20 20 20 20 
20 20 20 20 20 0A 

01 00 00 00 02 00 00 00 03 00 00 00    (1,2,3)
0A 00 00 00 0B 00 00 00 0C 00 00 00
0D 00 00 00 0E 00 00 00 0F 00 00 00
10 00 00 00 11 00 00 00 12 00 00 00
13 00 00 00                            (10,11,12,13,14,15,16,17,18,19)
1F 85 EB 51 B8 1E 09 40                3.14

04 00 00 00 05 00 00 00 06 00 00 00    (4,5,6)
FF FF FF FF FE FF FF FF FD FF FF FF
FC FF FF FF FB FF FF FF FA FF FF FF
F9 FF FF FF F8 FF FF FF F7 FF FF FF 
EC FF FF FF                            (-1,-2,-3,-4,-5,-6,-7,-8,-9,-20)
1F 85 EB 51 B8 1E 19 40                6.28
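
To make the layout above concrete, here is a minimal sketch that rebuilds those bytes and reads them back using only the Python standard library (no NumPy). The function names build_example, parse_npy_v1 and read_records are made up for illustration, and the sketch assumes format version 1.0 and the exact record layout of this example. A C reader would follow the same steps: check the magic, read the 2-byte little-endian header length, skip the header, then read packed 60-byte records.

```python
import ast
import struct

MAGIC = b"\x93NUMPY"

# One record of the example dtype: 3 x int32, 10 x int32, 1 x float64,
# little-endian with no padding between fields (12 + 40 + 8 = 60 bytes).
RECORD = struct.Struct("<3i10id")


def build_example():
    """Recreate the bytes of the example file shown in the dump (hypothetical helper)."""
    text = ("{'descr': [('outer', '<i4', (3,)), "
            "('outer2', [('inner', '<i4', (10,)), ('inner2', '<f8')])], "
            "'fortran_order': False, 'shape': (2,), }")
    # Pad with spaces and end with a newline so the header is 150 bytes long.
    header = (text.ljust(149) + "\n").encode("latin1")
    out = MAGIC + bytes([1, 0]) + struct.pack("<H", len(header)) + header
    out += RECORD.pack(1, 2, 3, *range(10, 20), 3.14)
    out += RECORD.pack(4, 5, 6, -1, -2, -3, -4, -5, -6, -7, -8, -9, -20, 6.28)
    return out


def parse_npy_v1(buf):
    """Split a version-1.0 .npy byte string into (header dict, raw data bytes)."""
    if buf[:6] != MAGIC:
        raise ValueError("not an .npy file")
    if (buf[6], buf[7]) != (1, 0):
        raise ValueError("this sketch only handles format version 1.0")
    (header_len,) = struct.unpack_from("<H", buf, 8)
    header = ast.literal_eval(buf[10:10 + header_len].decode("latin1"))
    return header, buf[10 + header_len:]


def read_records(buf):
    """Unpack every record, assuming the fixed layout described by RECORD."""
    header, data = parse_npy_v1(buf)
    records = []
    for i in range(header["shape"][0]):
        v = RECORD.unpack_from(data, i * RECORD.size)
        records.append((v[0:3], (v[3:13], v[13])))
    return header, records


if __name__ == "__main__":
    header, records = read_records(build_example())
    print(header["descr"])
    for rec in records:
        print(rec)
```

The struct format "<3i10id" plays the role a packed C struct would: the "<" prefix disables alignment padding, which matches how the records appear in the dump.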

Upvotes: 53

unutbu

Reputation: 880887

The format is described in numpy/lib/format.py, where you can also see the Python source code used to load npy files. np.load is defined here.
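
As a rough illustration of how the header's 'descr' entry maps to a byte layout, the sketch below flattens a nested descr list into a format string for Python's struct module. The helper names (field_format, descr_to_format) and the small type-code table are assumptions made for this example, not part of NumPy, and only a few little-endian numeric codes are covered. It also rejects object fields: per the format description, an array whose dtype contains Python objects is stored as a pickle of the array rather than as raw records, so it cannot be read with a plain fixed-size layout like this.

```python
import struct

# Hypothetical mapping from a few NumPy type codes to struct codes.
_CODES = {"i1": "b", "i2": "h", "i4": "i", "i8": "q",
          "u1": "B", "u2": "H", "u4": "I", "u8": "Q",
          "f4": "f", "f8": "d"}


def field_format(typestr):
    """Translate a type string such as '<i4' into a struct code."""
    byteorder, code = typestr[0], typestr[1:]
    if code == "O":
        raise ValueError("object fields are pickled, not stored as raw bytes")
    if byteorder not in "<|" or code not in _CODES:
        raise ValueError("unsupported type string: " + typestr)
    return _CODES[code]


def descr_to_format(descr):
    """Flatten a (possibly nested) descr list into a struct format body."""
    parts = []
    for field in descr:
        kind = field[1]
        shape = field[2] if len(field) > 2 else ()
        count = 1
        for dim in shape:
            count *= dim
        if isinstance(kind, list):
            # Nested structure: its fields are laid out inline, repeated per element.
            parts.append(descr_to_format(kind) * count)
        else:
            parts.append(str(count) + field_format(kind))
    return "".join(parts)


if __name__ == "__main__":
    descr = [("outer", "<i4", (3,)),
             ("outer2", [("inner", "<i4", (10,)), ("inner2", "<f8")])]
    fmt = "<" + descr_to_format(descr)
    print(fmt, struct.calcsize(fmt))
```

For the dtype in the question this gives a 60-byte record, which is the size a C reader would need to step through the data section one record at a time.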

Upvotes: 7
