Reputation: 1784
Using NumPy and h5py, it is possible to create ‘compound datatype’ datasets to be stored in an HDF5 file:
import h5py
import numpy as np
#
# Create a new file using default properties.
#
file = h5py.File('compound.h5','w')
#
# Create a dataset under the Root group.
#
comp_type = np.dtype([('fieldA', 'i4'), ('fieldB', 'f4')])
dataset = file.create_dataset("comp", (4,), comp_type)
It is also possible to use various compression filters in a ‘compression pipeline’, among them the ‘scale-offset’ filter:
cmpr_dataset = file.create_dataset("cmpr", (4,), 'i4', scaleoffset=0)
However, it is not clear to me whether, and if so how, the scale-offset filter can be given a specific parameter (e.g., the 0 in the above example) for each of the different fields of a compound datatype.
More generally, it is not clear to me whether and how any filter can be applied with field-specific parameters.
So, the questions are:
Is it possible to apply a filter to only a specific field of a compound datatype dataset, or to apply it with field-specific parameters?
If yes, how can this be done, syntax-wise?
My guess (fear) is that the nature of how the compound data is stored (in one ‘column’, instead of each field in its own ‘column’) will prohibit application of such field-specific filters, but I wanted to check, just to be sure.
Upvotes: 1
Views: 793
Reputation: 231385
Besides the h5py docs, look at the HDF5 docs. They go into more detail. If the underlying HDF5 library does not support this, then the h5py/NumPy interface won't either.
https://support.hdfgroup.org/HDF5/doc/UG/OldHtmlSource/10_Datasets.html#ScaleOffset
Elsewhere it says filters are applied to whole chunks.
The expression defining the compound type is pure numpy. h5py must be translating its descriptor into an equivalent HDF5 C-struct description. There are sample C and Fortran compound type definitions in the HDF5 docs.
All the docs say that the scale-offset filter applies only to integer and float types. That can be understood as excluding string, vlen, and compound types. What you are hoping is that it would still work with the numeric types inside a compound type. I don't think so.
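If field-specific filter parameters are really needed, one possible workaround (not a way to filter individual fields of a compound dataset) is to store each field as its own dataset, so each one gets its own filter pipeline. The following is only a sketch under assumptions: the helper name write_split, the group name "comp_split", the sample values, and the choice of scaleoffset=2 for the float field are all made up for illustration. The file is kept in memory with the 'core' driver so nothing is written to disk. The first part just checks what h5py does when scaleoffset is passed together with a compound dtype; I expect it to refuse, but that may depend on the h5py version.

```python
import h5py
import numpy as np

# Sample records using the same compound dtype as in the question (values are made up).
comp_type = np.dtype([('fieldA', 'i4'), ('fieldB', 'f4')])
records = np.array([(1, 0.5), (2, 1.25), (3, 2.0), (4, 3.75)], dtype=comp_type)

def write_split(parent, name, data, scaleoffsets):
    """Hypothetical helper: write each field of a structured array as its
    own dataset inside a group, each with its own scaleoffset setting."""
    grp = parent.create_group(name)
    for field in data.dtype.names:
        grp.create_dataset(
            field,
            data=np.ascontiguousarray(data[field]),
            scaleoffset=scaleoffsets[field],
        )
    return grp

# In-memory file: the 'core' driver with backing_store=False never touches disk.
with h5py.File('split_fields.h5', 'w', driver='core', backing_store=False) as f:
    # Check what happens when scaleoffset is combined with a compound dtype.
    try:
        f.create_dataset('comp', (4,), comp_type, scaleoffset=0)
        compound_accepted = True
    except Exception as exc:
        compound_accepted = False
        print('compound + scaleoffset rejected:', type(exc).__name__)

    # Workaround: one dataset per field, each with its own parameter.
    # For integers, 0 lets HDF5 pick the minimum number of bits (lossless);
    # for floats, 2 keeps two decimal digits (lossy in general).
    grp = write_split(f, 'comp_split', records, {'fieldA': 0, 'fieldB': 2})
    read_a = grp['fieldA'][...]
    read_b = grp['fieldB'][...]

print(read_a, read_b)
```

The price of this layout is that the fields are no longer stored together as records, so reading one full record means reading from several datasets.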
Upvotes: 1