gota

Reputation: 2659

How to inspect .h5 file in Python

How do I see what variables, datasets, etc. a given .h5 file has in Python?

I can read the file by running this

import h5py
f = h5py.File(filename, 'r')

How can I now see which variables my .h5 file has?

Running f.keys() outputs the non-informative

KeysView(<HDF5 file filename (mode r)>)

In MATLAB I simply call h5disp(filename), but I would like to know how to do the same in Python.

Upvotes: 6

Views: 10920

Answers (4)

Thedal

Reputation: 13

This prints the file's structure as a hierarchy, showing both the groups and the datasets contained in the .h5 file.

import h5py

def printGroup(group):
    # Walk the members of a group: recurse into sub-groups, describe datasets.
    for key in group.keys():
        item = group[key]
        if isinstance(item, h5py.Group):
            print(f"{item.name}/")
            printGroup(item)
        elif isinstance(item, h5py.Dataset):
            print(item.name, item.dtype, item.shape)

print("\n\nData File Structure")
with h5py.File('filePath.h5', 'r') as f:
    printGroup(f)

Upvotes: 1

A. Ahmed

Reputation: 93

I came across this question while looking for a way to display every node in an h5 file, so that I could extract only the nodes I want along with their datasets.

I think this snippet is simple and easy to understand (at least for me):

import h5py

h5 = h5py.File(filename, 'r')
def hierarchy(d):
    for item in d:
        node = d[item]
        if isinstance(node, h5py.Group):
            if len(node) == 0:  # group with no members
                print(node.name, ['empty group'])
            hierarchy(node)
        else:  # Dataset
            print(node.name, ['dataset'])
hierarchy(h5)

Since I will be using this in a GUI application, I am going to allow selecting only the items flagged as 'dataset'.
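If only the datasets matter, a variation is to let h5py do the walking with visititems and collect the results instead of printing them. This is just a minimal sketch: list_datasets is a hypothetical helper name, and storing the shape and dtype per path is an assumption about what a selection list might want to show.

```python
import h5py

def list_datasets(path):
    """Return {dataset_path: (shape, dtype_name)} for every dataset in the file."""
    found = {}

    def visitor(name, obj):
        # visititems passes each member's path relative to the group it was
        # called on (here the root), plus the object itself.
        if isinstance(obj, h5py.Dataset):
            found["/" + name] = (obj.shape, str(obj.dtype))
        # Returning None tells h5py to keep walking.

    with h5py.File(path, "r") as f:
        f.visititems(visitor)
    return found
```

Groups (including empty ones) are simply skipped, so the returned dictionary contains only selectable items.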

Upvotes: 2

Imanol Luengo

Reputation: 15889

Maybe overkill, but I had this lying around and it might be useful for someone:

from __future__ import print_function

import h5py as h5

def scan_hdf5(path, recursive=True, tab_step=2):
    def scan_node(g, tabs=0):
        print(' ' * tabs, g.name)
        for k, v in g.items():
            if isinstance(v, h5.Dataset):
                print(' ' * tabs + ' ' * tab_step + ' -', v.name)
            elif isinstance(v, h5.Group) and recursive:
                scan_node(v, tabs=tabs + tab_step)
    with h5.File(path, 'r') as f:
        scan_node(f)

A simple example:

>>> scan_hdf5('/tmp/dummy.h5')
/
   - /d1
   /g1
     - /g1/d2
     - /g1/d3
   /g2
     - /g2/d4
     /g2/g3
       - /g2/g3/d5

Or an alternative version that returns the elements in something more usable:

def scan_hdf5_2(path, recursive=True, tab_step=2):
    def scan_node(g, tabs=0):
        elems = []
        for k, v in g.items():
            if isinstance(v, h5.Dataset):
                elems.append(v.name)
            elif isinstance(v, h5.Group) and recursive:
                elems.append((v.name, scan_node(v, tabs=tabs + tab_step)))
        return elems
    with h5.File(path, 'r') as f:
        return scan_node(f)

which returns:

>>> scan_hdf5_2('/tmp/dummy.h5')
[u'/d1',
 (u'/g1', [u'/g1/d2', u'/g1/d3']),
 (u'/g2', [u'/g2/d4', (u'/g2/g3', [u'/g2/g3/d5'])])]

Upvotes: 8

Astrom

Reputation: 767

Did you try this?

print(list(f.keys()))

That should give you the names of everything at the top level of your HDF5 file (groups as well as datasets). The same call works on any group object to list what it contains.
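To go beyond the top level without writing the recursion yourself, something like the following might do. It is only a sketch: all_names is a made-up helper name, and it relies on Group.visit, which calls the given function once for each member path below the group.

```python
import h5py

def all_names(path):
    """Return (top-level keys, every member path) for the given HDF5 file."""
    with h5py.File(path, "r") as f:
        top_level = list(f.keys())   # names directly under the root group
        everything = []
        f.visit(everything.append)   # append returns None, so the walk continues
    return top_level, everything
```

The paths collected by visit are relative to the root, so a nested dataset shows up as something like 'g1/d2' rather than '/g1/d2'.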

Upvotes: 7
