Reputation: 677
When opening a .hdf5 file, one can explore its levels, keys, and names in different ways. Is there a way or a function that displays all the available paths in the .hdf5 file, ultimately showing the whole tree?
Upvotes: 7
Views: 8755
Reputation: 91
I modified Alex44's solution to make it an MWE. I also decided to return a string to make the function more versatile, and I prefer to display the shape of the data instead of its length.
import h5py
import numpy as np


def hdf5_tree(hdf5_file: h5py.File | h5py.Group, prefix: str = "") -> str:
    """Return a string containing the tree representation of the HDF5 file.

    Args:
        hdf5_file (h5py.File | h5py.Group): the HDF5 file object.
        prefix (str, optional): the prefix. Defaults to "".

    Returns:
        str: the full or partial tree representation of the file.
    """
    tree_string = ""
    items_index = len(hdf5_file)
    for key, hdf5_value in hdf5_file.items():
        items_index -= 1
        branch_symbol = "├──"
        prefix_symbol = "|"
        if items_index == 0:
            # Last item at this level: close the branch.
            branch_symbol = "└──"
            prefix_symbol = " "
        if isinstance(hdf5_value, h5py.Group):
            tree_string += f"{prefix}{branch_symbol} {key}\n"
            # Recurse into the group with a deeper prefix.
            tree_string += hdf5_tree(hdf5_value, f"{prefix}{prefix_symbol}   ")
        else:
            try:
                tree_string += f"{prefix}{branch_symbol} "
                tree_string += f"{key} {hdf5_value.shape}\n"
            except TypeError:
                tree_string += f"{prefix}{branch_symbol} {key} (scalar)\n"
    return tree_string


with h5py.File("test.h5", mode="w") as hdf5_file:
    hdf5_group = hdf5_file.require_group("top_group")
    hdf5_annotations = hdf5_group.require_group("annotations")
    hdf5_annotations.require_group("test_0")
    hdf5_annotations.require_group("test_1")
    hdf5_signals = hdf5_group.require_group("signals")
    data = np.asarray([0, 5, 4, 0, 4, 5, 6], dtype=np.float64)
    hdf5_signals.create_dataset("EEG_SIGNAL", data=data)
    print(hdf5_tree(hdf5_file))
Upvotes: 0
Reputation: 1204
A quick and dirty solution:
import h5py

with h5py.File('file.hdf5', 'r') as file:
    file.visit(print)  # prints the path of every group and dataset
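If you want the paths as data rather than printed output, the same `visit` call can fill a list. This is only a minimal sketch, assuming h5py is installed; the helper name `list_paths`, the file name `demo.h5`, and the in-memory `core` driver are my own choices for illustration and are not part of the answer above.

```python
import h5py


def list_paths(h5_obj):
    """Return every group and dataset path below h5_obj, relative to it."""
    paths = []
    # visit() stops early if the callable returns anything other than None;
    # list.append always returns None, so the whole tree is traversed.
    h5_obj.visit(paths.append)
    return paths


# Build a small throwaway file in memory (nothing is written to disk).
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
    f.create_group("top_group/annotations")
    f.create_dataset("top_group/signals/EEG_SIGNAL", data=[0.0, 5.0, 4.0])
    for path in list_paths(f):
        print(path)
```

Calling `list_paths` on a group instead of the file returns paths relative to that group, which can be handy for large files.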
Upvotes: 0
Reputation: 3855
For everyone who wants to stay with the h5py package:
The implementation is not a one-liner, but once this recursive function is defined, printing the tree is a single call:
import h5py

filename_hdf = 'data.hdf5'


def h5_tree(val, pre=''):
    items = len(val)
    for key, val in val.items():
        items -= 1
        if items == 0:
            # the last item
            if isinstance(val, h5py.Group):
                print(pre + '└── ' + key)
                h5_tree(val, pre + '    ')
            else:
                try:
                    print(pre + '└── ' + key + ' (%d)' % len(val))
                except TypeError:
                    # len() is not defined for scalar datasets
                    print(pre + '└── ' + key + ' (scalar)')
        else:
            if isinstance(val, h5py.Group):
                print(pre + '├── ' + key)
                h5_tree(val, pre + '│   ')
            else:
                try:
                    print(pre + '├── ' + key + ' (%d)' % len(val))
                except TypeError:
                    print(pre + '├── ' + key + ' (scalar)')


with h5py.File(filename_hdf, 'r') as hf:
    print(hf)
    h5_tree(hf)
Upvotes: 10
Reputation: 11
I wanted to write the hdf5 structures into a text file. So I had to modify @Alex44's code. Now you can store the whole structure as a string in case that's what you want.
import h5py


def h5_tree(val, pre='', out=""):
    length = len(val)
    for key, val in val.items():
        length -= 1
        if length == 0:  # the last item
            if isinstance(val, h5py.Group):
                out += pre + '└── ' + key + "\n"
                out = h5_tree(val, pre + '    ', out)
            else:
                out += pre + '└── ' + key + f' {val.shape}\n'
        else:
            if isinstance(val, h5py.Group):
                out += pre + '├── ' + key + "\n"
                out = h5_tree(val, pre + '│   ', out)
            else:
                out += pre + '├── ' + key + f' {val.shape}\n'
    return out


filename = "dummy.h5"
with h5py.File(filename, "r") as file:
    structure = h5_tree(file)
print(structure)
Upvotes: 1
Reputation: 7996
You can also get the file schema/contents without writing any Python code or installing additional packages. If you just want to see the entire schema, take a look at the h5dump utility from The HDF Group. There are options to control the amount of detail that is dumped. Note: the default is to dump everything. To get a quick/small dump, use: h5dump -n 1 --contents=1 h5filename.h5
Another Python package is PyTables. It has a utility, ptdump, that is a command-line tool to interrogate an HDF file (similar to h5dump above).
Finally, here are some tips if you want to programmatically access groups and datasets recursively in Python. h5py and tables (PyTables) each have methods to do this:
In h5py:
Use the object.visititems(callable) method. It calls the callable function for each object in the tree.
In PyTables:
PyTables has multiple ways to recursively access groups, datasets and nodes. There are methods that return an iterable (object.walk_nodes), or return a list (object.list_nodes). There is also a method that returns an iterable that is not recursive (object.iter_nodes).
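For the h5py route mentioned above, here is a rough sketch of what a `visititems` callback might look like. It assumes h5py is installed; the helper name `summarize`, the file name `demo.h5`, and the in-memory `core` driver are hypothetical choices for the example only.

```python
import h5py


def summarize(h5_obj):
    """Return (path, kind, shape) for every object below h5_obj."""
    rows = []

    def collect(name, obj):
        # visititems passes the relative path and the opened object.
        if isinstance(obj, h5py.Dataset):
            rows.append((name, "dataset", obj.shape))
        else:
            rows.append((name, "group", None))
        # Implicitly returns None, so the traversal continues.

    h5_obj.visititems(collect)
    return rows


# Build a small throwaway file in memory (nothing is written to disk).
with h5py.File("demo.h5", "w", driver="core", backing_store=False) as f:
    f.create_dataset("signals/EEG_SIGNAL", data=[0.0, 5.0, 4.0])
    f.create_dataset("signals/scale", data=1.5)  # scalar dataset, shape ()
    for path, kind, shape in summarize(f):
        print(path, kind, "" if shape is None else shape)
```

Compared with `visit`, the callback also receives the object itself, so shape, dtype, or attributes can be inspected without a second lookup.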
Upvotes: 3
Reputation: 7744
Try using the nexusformat package to list the structure of the hdf5 file.
Install it with pip install nexusformat
Code
import nexusformat.nexus as nx
f = nx.nxload('myhdf5file.hdf5')
print(f.tree)
This should print the entire structure of the file. For more on that see this thread. Examples can be found here
Upvotes: 4