Tar9etPractice

Reputation: 55

Python HDF5 Attributes

I'm trying to save measurement attributes in an HDF5 file. The files I work with are formatted so that a single attribute entry appears to hold a group of values with different datatypes.

For example, for my file, the command

f = h5py.File('test.data','r+')
f['Measurement/Surface'].attrs['X Converter']

produces

array([(b'LateralCat', b'Pixels', array([0.        , 2.00097752, 0.        , 0.        ]))],
      dtype=[('Category', 'O'), ('BaseUnit', 'O'), ('Parameters', 'O')])

Here, the first two entries are strings, and the third is an array. Now if I try to save the values to a different file:

f1 = h5py.File('test_output.data','r+')
f1['Measurement/Surface'].attrs.create('X Converter',[(b'LateralCat', b'Pixels', np.array([0.        , 2.00097752, 0.        , 0.        ]))])

I get this error:

Traceback (most recent call last):
  File "<pyshell#94>", line 1, in <module>
    f1['Measurement/Surface'].attrs.create('X Converter',[(b'LateralCat', b'Pixels', np.array([0. , 2.00097752, 0. , 0. ]))])
  File "C:\WinPython\WinPython-64bit-3.6.3.0Zero\python-3.6.3.amd64\lib\site-packages\h5py_hl\attrs.py", line 171, in create
    htype = h5t.py_create(original_dtype, logical=True)
  File "h5py\h5t.pyx", line 1611, in h5py.h5t.py_create
  File "h5py\h5t.pyx", line 1633, in h5py.h5t.py_create
  File "h5py\h5t.pyx", line 1688, in h5py.h5t.py_create
TypeError: Object dtype dtype('O') has no native HDF5 equivalent

What am I missing?

Upvotes: 0

Views: 1110

Answers (1)

hpaulj

Reputation: 231510

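You aren't saving the same thing. The dtype of the original is significant: it is a structured array with a compound dtype, whereas the list of tuples you pass to create carries no dtype, so NumPy turns it into a plain object array.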

In [101]: [(b'LateralCat', b'Pixels', np.array([0.        , 2.00097752, 0.        ,
     ...:  0.        ]))]
Out[101]: 
[(b'LateralCat',
  b'Pixels',
  array([0.        , 2.00097752, 0.        , 0.        ]))]
In [102]: np.array(_)
<ipython-input-102-7a2cd91c32ca>:1: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray
  np.array(_)
Out[102]: 
array([[b'LateralCat', b'Pixels',
        array([0.        , 2.00097752, 0.        , 0.        ])]],
      dtype=object)

In [104]: np.array([(b'LateralCat', b'Pixels',
     ...:            np.array([0.        , 2.00097752, 0.        , 0.        ]))],
     ...:       dtype=[('Category', 'O'), ('BaseUnit', 'O'), ('Parameters', 'O')])
Out[104]: 
array([(b'LateralCat', b'Pixels', array([0.        , 2.00097752, 0.        , 0.        ]))],
      dtype=[('Category', 'O'), ('BaseUnit', 'O'), ('Parameters', 'O')])

In [105]: x = _
In [106]: x.dtype
Out[106]: dtype([('Category', 'O'), ('BaseUnit', 'O'), ('Parameters', 'O')])

In [108]: x['Category']
Out[108]: array([b'LateralCat'], dtype=object)
In [109]: x['BaseUnit']
Out[109]: array([b'Pixels'], dtype=object)
In [110]: x['Parameters']
Out[110]: 
array([array([0.        , 2.00097752, 0.        , 0.        ])],
      dtype=object)

Matching the original dtype doesn't quite solve it, though, since its fields are still object dtype:

In [111]: import h5py
In [112]: f=h5py.File('test.h5','w')
In [113]: 
In [113]: g = f.create_group('test')
In [114]: g.attrs.create('converter',x)
Traceback (most recent call last):
...
TypeError: Object dtype dtype('O') has no native HDF5 equivalent

As noted in the comment, NumPy object dtype is problematic when writing with h5py. Do you know how the original file was created? It may use some format or structure that h5py renders as a compound dtype with object fields when reading, but that can't be written back directly. I'd have to dig further into the docs (and maybe the original file) to learn more.

https://docs.h5py.org/en/stable/special.html
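
One possible explanation, offered as an assumption rather than something confirmed from the original file: the special types on that docs page are object dtypes with extra metadata attached, and that metadata is what lets h5py map them to an HDF5 type. A bare 'O' has none. The sketch below only inspects dtypes and writes nothing to a file; the variable names are made up for illustration.

```python
import numpy as np
import h5py

# A bare object dtype, as produced by dtype=[('Category', 'O'), ...]
plain_obj = np.dtype("O")

# h5py's special dtypes: also object dtypes, but tagged with metadata
vlen_bytes = h5py.string_dtype(encoding="ascii")   # variable-length string
vlen_floats = h5py.vlen_dtype(np.float64)          # variable-length float64 array

# All three are object dtypes as far as NumPy is concerned
print(plain_obj.kind, vlen_bytes.kind, vlen_floats.kind)

# Only the h5py-made ones are recognized as having an HDF5 equivalent
print(h5py.check_string_dtype(plain_obj))    # nothing recognized for the bare 'O'
print(h5py.check_string_dtype(vlen_bytes))   # reports the string encoding
print(h5py.check_vlen_dtype(vlen_floats))    # reports the base type
```

If the fields of the attribute read from the original file carry this kind of metadata, that would account for them printing as 'O' while a hand-typed 'O' is rejected. Whether a compound dtype built from these special types can be written as an attribute is something I have not checked here.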

I can write that data as a more conventional structured array:

In [120]: y=np.array([(b'LateralCat', b'Pixels',
     ...:             np.array([0.        , 2.00097752, 0.        , 0.        ]))],
     ...:       dtype=[('Category', 'S20'), ('BaseUnit', 'S20'), ('Parameters', 'float',4)])
In [121]: y
Out[121]: 
array([(b'LateralCat', b'Pixels', [0.        , 2.00097752, 0.        , 0.        ])],
      dtype=[('Category', 'S20'), ('BaseUnit', 'S20'), ('Parameters', '<f8', (4,))])

In [122]: g.attrs.create('converter',y)
In [125]: g.attrs['converter']
Out[125]: 
array([(b'LateralCat', b'Pixels', [0.        , 2.00097752, 0.        , 0.        ])],
      dtype=[('Category', 'S20'), ('BaseUnit', 'S20'), ('Parameters', '<f8', (4,))])
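
The same round trip can be written as a standalone script. This is a minimal sketch under a few assumptions that are not in the original: the file name is a placeholder, the file lives only in memory (h5py's core driver with backing_store=False), and the group path and attribute name are borrowed from the question. The 20-byte string width is an arbitrary choice; longer strings would be truncated.

```python
import numpy as np
import h5py

# Fixed-width fields: 20-byte strings and a length-4 float64 subarray.
# None of these is object dtype, so each has a native HDF5 equivalent.
converter_dtype = np.dtype([
    ("Category", "S20"),
    ("BaseUnit", "S20"),
    ("Parameters", np.float64, (4,)),
])

converter = np.array(
    [(b"LateralCat", b"Pixels", [0.0, 2.00097752, 0.0, 0.0])],
    dtype=converter_dtype,
)

# In-memory file so the sketch leaves nothing on disk (placeholder name).
with h5py.File("roundtrip_sketch.h5", "w", driver="core", backing_store=False) as f:
    surface = f.create_group("Measurement/Surface")
    surface.attrs.create("X Converter", converter)
    restored = surface.attrs["X Converter"]  # read back as a NumPy array

print(restored.dtype.names)
print(restored["Category"][0], restored["Parameters"][0])
```

Note that this stores a different on-disk type than the original file appears to use, so software that expects the original layout may not read it the same way.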

Upvotes: 1
