Alf

Reputation: 2009

Setting and reading dimscale in hdf5 files correctly in python

I am trying to attach dimension scales to datasets that I want to store in HDF5 files with Python, but I get an error when I try to print the dataset's attributes after attaching the scale. The relevant code snippet reads as follows:

import h5py
import numpy as np

# create data and x-axis
my_data = np.random.randint(10, size=(100, 200))
x_axis  = np.linspace(0, 1, 100)

h5f = h5py.File('my_file.h5','w')
h5f.create_dataset( 'data_1', data=my_data )
h5f['data_1'].dims[0].label = 'm'
h5f['data_1'].dims.create_scale( h5f['x_axis'], 'x' )

# the following line is creating the problems
h5f['data_1'].dims[0].attach_scale( h5f['x_axis'] )

# this is where the crash happens but only if the above line is included
for ii in h5f['data_1'].attrs.items():
    print ii

h5f.close()

The command print(h5py.version.info) prints the following output:

Summary of the h5py configuration
---------------------------------

h5py    2.2.1
HDF5    1.8.11
Python  2.7.6 (default, Jun 22 2015, 17:58:13) 
[GCC 4.8.2]
sys.platform    linux2
sys.maxsize     9223372036854775807
numpy   1.8.2

The error message is the following:

Traceback (most recent call last):
  File "HDF_write_dimScales.py", line 16
    for ii in h5f['data_1'].attrs.items():
  File "/usr/lib/python2.7/dist-packages/h5py/_hl/base.py", line 347, in items
    return [(x, self.get(x)) for x in self]
  File "/usr/lib/python2.7/dist-packages/h5py/_hl/base.py", line 310, in get
    return self[name]
  File "/usr/lib/python2.7/dist-packages/h5py/_hl/attrs.py", line 55, in __getitem__
    rtdt = readtime_dtype(attr.dtype, [])
  File "h5a.pyx", line 318, in h5py.h5a.AttrID.dtype.__get__ (h5py/h5a.c:4285)
  File "h5t.pyx", line 337, in h5py.h5t.TypeID.py_dtype (h5py/h5t.c:3892)
TypeError: No NumPy equivalent for TypeVlenID exists

Any ideas or hints are appreciated.

Upvotes: 0

Views: 599

Answers (2)

Yossarian

Reputation: 5471

It works for me on h5py 2.5.0 with some slight adjustments. The problem may have to do with when you call create_scale. With h5py 2.5.0, I get a KeyError for h5f['x_axis'] in your create_scale() call, because no dataset named x_axis exists in the file yet. To get your example to work, I had to explicitly create the x_axis dataset first.

import h5py
import numpy as np

# create data and x-axis
my_data = np.random.randint(10, size=(100, 200))

# Use a context manager to ensure h5f is closed
with h5py.File('my_file.h5','w') as h5f:
    h5f.create_dataset( 'data_1', data=my_data )

    # Create the x_axis dataset directly in the HDF5 file
    h5f['x_axis']  = np.linspace(0, 1, 100)

    h5f['data_1'].dims[0].label = 'm'

    # Now we can create and attach the scale without problems
    h5f['data_1'].dims.create_scale( h5f['x_axis'], 'x' )
    h5f['data_1'].dims[0].attach_scale( h5f['x_axis'] )

    for ii in h5f['data_1'].attrs.items():
        print(ii)

# Output
#(u'DIMENSION_LABELS', array(['m', ''], dtype=object))
#(u'DIMENSION_LIST', array([array([<HDF5 object reference>], dtype=object),
#       array([], dtype=object)], dtype=object))

If you're still having problems, you may have to upgrade to h5py 2.5.0, which has better handling (although still not perfect) of VLEN types.
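
If the goal is only to check that the scale was attached, the dims interface can be queried directly instead of going through attrs. The following is a minimal sketch, not a tested fix for h5py 2.2.1: it assumes h5py 2.10 or newer (for Dataset.make_scale), and the file name scale_check.h5 is a placeholder. Whether this route avoids the TypeError on older releases is an assumption.

```python
import h5py
import numpy as np

# In-memory file (core driver, no backing store), so nothing touches the disk.
# The file name 'scale_check.h5' is a placeholder.
with h5py.File('scale_check.h5', 'w', driver='core', backing_store=False) as h5f:
    data = h5f.create_dataset('data_1', data=np.random.randint(10, size=(100, 200)))
    x_axis = h5f.create_dataset('x_axis', data=np.linspace(0, 1, 100))

    # Assumes h5py >= 2.10; older releases use data.dims.create_scale(x_axis, 'x')
    x_axis.make_scale('x')
    data.dims[0].label = 'm'
    data.dims[0].attach_scale(x_axis)

    # Read the information back through the dims interface, not through attrs
    dim = data.dims[0]
    label = dim.label                # label of dimension 0
    scale_names = list(dim.keys())   # names of the scales attached to dimension 0
    attached = dim[0]                # first attached scale, returned as a Dataset
    attached_name = attached.name
    attached_shape = attached.shape

print(label, scale_names, attached_name, attached_shape)
```

The values are copied into plain Python objects inside the with block because the datasets become invalid once the file is closed.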

Upvotes: 1

hpaulj

Reputation: 231738

This is just a guess, but since the error references TypeVlenID, it may have something to do with an incomplete implementation of vlen in h5py (especially in your version of the module). These related questions describe similar vlen problems:

Inexplicable behavior when using vlen with h5py

Writing to compound dataset with variable length string via h5py (HDF5)
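
If upgrading is not an option, one possible workaround is to read the attributes one at a time and skip those that h5py cannot convert. The traceback in the question shows the TypeError being raised while a single attribute value is read, not while the names are listed, so a per-attribute try/except should be enough. This is a sketch only: readable_attrs is a hypothetical helper name, and it has not been tried against h5py 2.2.1.

```python
def readable_attrs(obj):
    """Split the attributes of obj into readable values and skipped names.

    obj is anything with a mapping-like .attrs, such as an h5py dataset.
    readable_attrs is a hypothetical helper, not part of h5py.
    """
    values = {}
    skipped = {}
    # Iterating over attrs yields attribute names only; no value is read yet
    for name in obj.attrs:
        try:
            values[name] = obj.attrs[name]
        except TypeError as exc:
            # e.g. "No NumPy equivalent for TypeVlenID exists"
            skipped[name] = str(exc)
    return values, skipped
```

With the file from the question open, values, skipped = readable_attrs(h5f['data_1']) would be expected to return DIMENSION_LABELS in values and report DIMENSION_LIST in skipped instead of aborting the loop.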

Upvotes: 0
