Walter

Reputation: 45464

How to read strings from HDF5 dataset using h5py

I have an HDF5 file for which h5dump prints the following (irrelevant content omitted):

HDF5 "file.h5" {
GROUP "/" {
  DATASET "history" {
    DATATYPE  H5T_STRING {
      STRSIZE H5T_VARIABLE;
      STRPAD H5T_STR_NULLTERM;
      CSET H5T_CSET_ASCII;
      CTYPE H5T_C_S1;
    }
    DATASPACE  SIMPLE { ( 1 ) / ( H5S_UNLIMITED ) }
    DATA {
    (0): "some string"
    }
  }
}

and which I'm trying to read from Python (3.5) using h5py. My attempt so far is

import h5py
F = h5py.File('file.h5', "r")
H = list()
for x in F['history']:
    H.append(str(x))

but

for x in H:
    print(x)

produces

b'some string'

instead of simply

some string

How can I extract the plain string without the b'...' wrapper? What do I need to do instead of str(x)?

P.S. This is my first Python question, so please bear with me.

Upvotes: 3

Views: 6156

Answers (1)

corinna

Reputation: 649

Just use

H = [x.decode() for x in F['history']]

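Each element of the dataset comes back as a bytes object, and str() on bytes returns its repr, which is why you see b'some string'. Calling decode() converts the bytes to a str, so this list comprehension returns H as a list of plain strings.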
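To see the difference without needing the HDF5 file, here is a minimal sketch that uses a hard-coded bytes value as a stand-in for what iterating over F['history'] yields. The names raw_values, as_repr and as_text are made up for illustration, and the stand-in assumes the elements are bytes, as the b'...' output in the question suggests.

```python
# Stand-in for the elements h5py yields for this dataset (assumption:
# they are bytes objects, as the b'...' output in the question suggests).
raw_values = [b"some string"]

# str() on a bytes object returns its repr, including the b'' wrapper.
as_repr = [str(x) for x in raw_values]

# decode() converts bytes to text; the dataset is declared as ASCII,
# so the codec is named explicitly here.
as_text = [x.decode("ascii") for x in raw_values]

print(as_repr[0])  # b'some string'
print(as_text[0])  # some string
```

Plain x.decode() defaults to UTF-8, which also handles ASCII data. With h5py 3.0 or later, F['history'].asstr() should give a wrapper that does this decoding on read, but check the documentation for the version you have installed.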

Upvotes: 4
