Reputation: 113
I have a problem regarding a selective read-in routine while using h5py.
import h5py
f = h5py.File('file.hdf5','r')
data = f['Data']
The 'Data' dataset contains positive values and also some placeholders set to -9999.
How can I get only the positive values for calculations like np.min?
np.ma.masked_array creates a full copy of the array, so all the benefits of using h5py are lost (regarding memory usage). The problem is that I get errors if I try to read data sets that exceed 100 million values per data set using data = f['Data'][:,0].
Or, if this is not possible, would something like the following work?
np.place(data[...], data[...] <= -9999, float('nan'))
Thanks in advance
Upvotes: 3
Views: 602
Reputation: 404
You could use:
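mask = f['Data'][...] >= 0  # an h5py Dataset cannot be compared directly; [...] reads it into a NumPy array first
data = f['Data'][mask]      # h5py accepts a boolean mask with the same shape as the dataset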
although I am not sure how much memory the mask calculation itself uses.
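If memory is the main concern, one option might be to read the dataset in blocks and keep a running minimum, so that only one block is in memory at a time. Below is a minimal sketch, assuming 'Data' is two-dimensional and the first column is wanted, as in the question. The function name positive_min and the block size are made up for illustration and are not part of h5py.

```python
import numpy as np

def positive_min(dset, column=0, block=1_000_000):
    """Minimum of one column, ignoring negative placeholders such as -9999.

    `dset` can be an h5py dataset or any 2-D array-like that supports
    NumPy-style slicing; only `block` rows are read at a time.
    """
    n_rows = dset.shape[0]
    best = np.inf
    found = False
    for start in range(0, n_rows, block):
        stop = min(start + block, n_rows)
        chunk = np.asarray(dset[start:stop, column])  # reads only this slice
        valid = chunk[chunk >= 0]                     # drop the placeholders
        if valid.size:
            best = min(best, valid.min())
            found = True
    return float(best) if found else None             # None if nothing is valid

# Hypothetical usage with the file from the question:
# import h5py
# with h5py.File('file.hdf5', 'r') as f:
#     print(positive_min(f['Data'], column=0))
```

The same loop should work for np.max or a running sum. The block size is a trade-off between memory use and the number of reads, and it probably works best when it lines up with the dataset's chunk layout, if it has one.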
Upvotes: 1