Reputation: 163
I have a problem for which I found an inelegant solution, and I want to know whether there is a better way of doing this. (I am using Python 3.6.)
I want to store a set of results from experiments in different groups of an .hdf5 file. But I then want to be able to open the file, iterate over all the groups and only get the datasets from groups of a specific kind.
The inelegant solution I found is to keep the information to distinguish the groups in the group name. For instance, the 01 in "ExpA01".
Code to generate the file:
import h5py
import numpy as np

if __name__ == "__main__":
    # name of the file
    FileName = "myFile.hdf5"
    # open the file
    myFile = h5py.File(FileName, 'w')
    # list of groups
    NameList = ["ExpA01", "ExpA02", "ExpB01", "ExpB02"]
    for name in NameList:
        # create new group with the name from the NameList
        myFile.create_group(name)
        # create random data
        dataset = np.random.randint(0, 10, 10)
        # add data set to the group
        myFile[name].create_dataset("x", data=dataset)
    myFile.close()  # close the file
Now I want to read only the data from the groups whose names end in "01". To do so, I basically read the information from the group name: myFile[k].name.split("/")[-1][-2::] == "01".
Code for reading the file:
import h5py
import numpy as np

if __name__ == "__main__":
    FileName = "myFile.hdf5"
    # open the file
    myFile = h5py.File(FileName, 'r')
    for k in myFile.keys():  # loop over all groups
        # keep only groups whose name ends in "01"
        if myFile[k].name.split("/")[-1][-2::] == "01":
            data = np.zeros(myFile[k]["x"].shape)
            myFile[k]["x"].read_direct(data)
            print(data)
    myFile.close()
In short, writing distinguishing information into the group name and then slicing the string is an ugly solution.
What is a better way of doing this?
Thanks for reading.
Upvotes: 0
Views: 946
Reputation: 8006
Have you considered adding an attribute to each group?
Then you could filter groups based on a test of attribute value.
Attribute values are not limited to strings. My example uses a string, but they can also be ints, floats, or small NumPy arrays.
# Quick example to create a group attribute, then retrieve:
In [3]: h5f = h5py.File('attr_test.h5','w')
In [4]: grp = h5f.create_group('group1')
In [5]: h5f['group1'].attrs['key']='value'
...:
In [6]: get_value = h5f['group1'].attrs['key']
In [7]: print (get_value)
value
I thought I'd add another example with 2 different values for the attribute.
It creates 26 groups named group_a through group_z, and sets the key attribute to vowel for a/e/i/o/u and consonant for all other letters.
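import h5py

vowels = 'aeiouAEIOU'
h5f = h5py.File('attr_test.h5', 'w')
# create groups group_a through group_z (ASCII codes 97 to 122)
for code in range(97, 123):
    letter = chr(code)
    grp = h5f.create_group('group_' + letter)
    # tag each group according to its letter
    if letter in vowels:
        grp.attrs['key'] = 'vowel'
    else:
        grp.attrs['key'] = 'consonant'
# read the attribute back from every group
for name in h5f.keys():
    get_value = h5f[name].attrs['key']
    print(name, ':', get_value)
h5f.close()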
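Applied to the file from the question, the same idea might look like the sketch below. The attribute names "experiment" and "run" and the helper functions write_experiments and read_by_run are my own choices for illustration, not anything h5py prescribes, and the file is written to a temporary directory only so the sketch does not overwrite an existing myFile.hdf5.

```python
import os
import tempfile

import h5py
import numpy as np


def write_experiments(path):
    """Create one group per experiment and tag it with attributes."""
    # (group name, experiment label, run number); the attribute names
    # "experiment" and "run" are hypothetical choices for this sketch.
    layout = [("ExpA01", "A", 1), ("ExpA02", "A", 2),
              ("ExpB01", "B", 1), ("ExpB02", "B", 2)]
    with h5py.File(path, "w") as h5f:
        for name, experiment, run in layout:
            grp = h5f.create_group(name)
            grp.attrs["experiment"] = experiment
            grp.attrs["run"] = run
            grp.create_dataset("x", data=np.random.randint(0, 10, 10))


def read_by_run(path, run):
    """Return {group name: data} for groups whose "run" attribute matches."""
    selected = {}
    with h5py.File(path, "r") as h5f:
        for name, grp in h5f.items():
            # attrs.get returns None for groups that lack the attribute,
            # so untagged groups are skipped instead of raising KeyError
            if grp.attrs.get("run") == run:
                selected[name] = grp["x"][()]  # read the whole dataset
    return selected


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "myFile.hdf5")
    write_experiments(path)
    for name, data in read_by_run(path, 1).items():
        print(name, data)
```

With this layout the group name no longer has to encode anything: the filter is a plain comparison on an attribute, and a second attribute such as "experiment" can be tested the same way.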
Upvotes: 1