Jorge Leitao

Reputation: 20133

Which NumPy dtypes does h5py accept?

I am serializing a Python dictionary containing NumPy types to an HDF5 file.

In general, the code is:

def save_dict_to_hdf5_group(f, d):
    # f is an open h5py.File or h5py.Group; d is the dictionary to store
    for key, value in d.items():
        if isinstance(value, str):
            f.attrs[key] = value.encode('utf-8')
        elif isinstance(value, XXXXXX):  # XXXXXX: the type(s) in question
            param_dset = f.create_dataset(key, value.shape, dtype=value.dtype)
            if not value.shape:
                # scalar
                param_dset[()] = value
            else:
                param_dset[:] = value
        elif isinstance(value, dict):
            save_dict_to_hdf5_group(f.create_group(key), value)
        else:
            raise ValueError('Cannot save type "%s" to HDF5' % type(value))

I am struggling with what to put in XXXXXX. Specifically, can I put any NumPy type there, or does HDF5 only store specific types?

For example, (np.ndarray, np.int64) would be one choice, but it would miss float32. (np.ndarray, np.generic) would be another, but does h5py accept all generic NumPy types?

Upvotes: 2

Views: 3549

Answers (2)

Kermit

Reputation: 6002

Native HDF5 type          NumPy equivalent
----------------------------------------------------------------------------------------
Integer                   dtype("i")
Float                     dtype("f")
Strings (fixed width)     dtype("S10")
Strings (variable width)  h5py.special_dtype(vlen=bytes)
Compound                  dtype([ ("field1", "i"), ("field2", "f") ])
Enum                      h5py.special_dtype(enum=("i",{"RED":0, "GREEN":1, "BLUE":2}))
Array                     dtype("(2,2)f")
Opaque                    dtype("V10")
Reference                 h5py.special_dtype(ref=h5py.Reference)

https://www.oreilly.com/library/view/python-and-hdf5/9781491944981/ch07.html
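As a quick check of the plain NumPy entries in the table above, here is a minimal sketch that builds each dtype and inspects it. It uses only NumPy; the variable names are illustrative, and the h5py special dtypes (variable-width strings, enums, references) are left out.

```python
import numpy as np

# Plain NumPy dtypes from the table (variable names are illustrative only)
int_dt = np.dtype("i")            # C int
float_dt = np.dtype("f")          # single-precision float
fixed_str_dt = np.dtype("S10")    # fixed-width byte string, 10 bytes
compound_dt = np.dtype([("field1", "i"), ("field2", "f")])
array_dt = np.dtype("(2,2)f")     # 2x2 subarray of single-precision floats
opaque_dt = np.dtype("V10")       # 10 raw bytes

# The kind code is what distinguishes the families
for dt in (int_dt, float_dt, fixed_str_dt, compound_dt, array_dt, opaque_dt):
    print(dt, dt.kind)
```

Note that the compound dtype takes a list of (name, type) tuples, separated by a comma rather than a colon.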

Upvotes: 0

Uvar

Reputation: 3462

From the h5py documentation:

Fully supported types:

Type                        Precisions                                   Notes
Integer                     1, 2, 4 or 8 byte, BE/LE, signed/unsigned
Float                       2, 4, 8, 12, 16 byte, BE/LE
Complex                     8 or 16 byte, BE/LE                          Stored as HDF5 struct
Compound                    Arbitrary names and offsets
Strings (fixed-length)      Any length
Strings (variable-length)   Any length, ASCII or Unicode
Opaque (kind ‘V’)           Any length
Boolean                     NumPy 1-byte bool                            Stored as HDF5 enum
Array                       Any supported type
Enumeration                 Any NumPy integer type                       Read/write as integers
References                  Region and object
Variable length array       Any supported type                           See Special Types

Unsupported types:

Type                
HDF5 “time” type     
NumPy “U” strings   
NumPy generic “O”
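Given these lists, one possible answer to the XXXXXX question is to accept (np.ndarray, np.generic) and then reject the dtype kinds listed as unsupported. The sketch below is only an illustration under that assumption: the helper name is_h5py_storable and the UNSUPPORTED_KINDS set are hypothetical, derived from the "U" and "O" entries above, and other kinds not mentioned in the tables (for example datetimes) are not checked.

```python
import numpy as np

# Kind codes taken from the "Unsupported types" list above:
# "U" = NumPy unicode strings, "O" = generic Python objects.
# This set is an assumption based on that list, not an h5py API.
UNSUPPORTED_KINDS = {"U", "O"}


def is_h5py_storable(value):
    """Return True if value is a NumPy array or scalar whose dtype kind
    is not in the unsupported list (hypothetical helper)."""
    if not isinstance(value, (np.ndarray, np.generic)):
        return False
    return value.dtype.kind not in UNSUPPORTED_KINDS


print(is_h5py_storable(np.float32(1.5)))         # NumPy scalar
print(is_h5py_storable(np.arange(3)))            # integer array
print(is_h5py_storable(np.array(["a", "b"])))    # "U" string array
print(is_h5py_storable(np.array([None, 1])))     # object array
```

In the loop from the question, such a check could replace the plain isinstance test, with "U" arrays encoded to bytes first, in the same way the question already encodes str values.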

Upvotes: 2
