Reputation: 8136
I expected the following code to work, but it doesn't.
import h5py
import numpy as np
with h5py.File('file.hdf5','w') as hf:
    dt = h5py.special_dtype(vlen=str)
    feature_names = np.array(['a', 'b', 'c'])
    hf.create_dataset('feature names', data=feature_names, dtype=dt)
I get the error message TypeError: No conversion path for dtype: dtype('<U1'). The following code does work, but using a for loop to copy the data seems a bit clunky to me. Is there a more straightforward way to do this? I would prefer to be able to pass the sequence of strings directly into the create_dataset function.
import h5py
import numpy as np
with h5py.File('file.hdf5','w') as hf:
    dt = h5py.special_dtype(vlen=str)
    feature_names = np.array(['a', 'b', 'c'])
    ds = hf.create_dataset('feature names', (len(feature_names),), dtype=dt)
    for i in range(len(feature_names)):
        ds[i] = feature_names[i]
Note: My question follows from this answer to Storing a list of strings to a HDF5 Dataset from Python, but I don't consider it a duplicate of that question.
Upvotes: 6
Views: 3169
Reputation: 936
You almost had it; the missing detail was to pass dtype to np.array:
import h5py
import numpy as np
with h5py.File('file.hdf5','w') as hf:
    dt = h5py.special_dtype(vlen=str)
    feature_names = np.array(['a', 'b', 'c'], dtype=dt)
    hf.create_dataset('feature names', data=feature_names)
PS: This looks like a bug to me: create_dataset ignores the given dtype and doesn't apply it to the given data.
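If the strings already live in a '<U' array (as in the question), the same idea can be wrapped in a small helper that rebuilds the array with the variable-length string dtype before writing. This is only a sketch: write_strings and read_strings are hypothetical names, not part of h5py, and the bytes/str check on read is there because, as far as I know, what comes back depends on the h5py version.

```python
import h5py
import numpy as np


def write_strings(path, name, strings):
    """Write a sequence of str as a variable-length string dataset (hypothetical helper)."""
    dt = h5py.special_dtype(vlen=str)
    # Rebuild the data with the vlen dtype, as in the snippet above,
    # so h5py never sees a '<U' array.
    data = np.array([str(s) for s in strings], dtype=dt)
    with h5py.File(path, 'w') as hf:
        hf.create_dataset(name, data=data)


def read_strings(path, name):
    """Read the dataset back as a list of Python str (hypothetical helper)."""
    with h5py.File(path, 'r') as hf:
        raw = hf[name][()]
    # Elements may come back as bytes or str depending on the h5py version.
    return [s.decode('utf-8') if isinstance(s, bytes) else s for s in raw]
```

With this, write_strings('file.hdf5', 'feature names', np.array(['a', 'b', 'c'])) takes the '<U1' array directly, and the for loop is no longer needed.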
Upvotes: 10