Reputation: 542
Is there any way to remove a dataset from an hdf5 file, preferably using h5py? Or alternatively, is it possible to overwrite a dataset while keeping the other datasets intact?
To my understanding, h5py can read/write hdf5 files in 5 modes:
f = h5py.File("filename.hdf5", 'mode')
where mode can be r for read, r+ for read-write, a for read-write but creates a new file if it doesn't exist, w for write/overwrite, and w- which is the same as w but fails if the file already exists. I have tried all of them, but none seem to work.
Any suggestions are much appreciated.
Upvotes: 29
Views: 51519
Reputation: 11
I wanted to make you aware of a tool that one of my colleagues developed and released as open source. It's called h5nav. You can install it with pip (https://pypi.org/project/h5nav/):
pip install h5nav
h5nav toto.h5
ls
rm the_group_you_want_to_delete
exit
Note that you'll still have to use h5repack to lower the size of your file.
Best, Jérôme
Upvotes: 0
Reputation: 171
I tried this out and the only way I could actually reduce the size of the file is by copying everything to a new file and just leaving out the dataset I was not interested in:
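import h5py

# Copy everything except the unwanted dataset into a new, smaller file.
with h5py.File('WFA.h5', 'r') as fs, h5py.File('WFA_red.h5', 'w') as fd:
    for a in fs.attrs:  # root-level attributes
        fd.attrs[a] = fs.attrs[a]
    for d in fs:  # top-level groups and datasets
        if 'SFS_TRANSITION' not in d:
            fs.copy(d, fd)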
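To see the effect on disk usage, here is a minimal, self-contained sketch of the same copy-and-skip idea. The file names (`full.h5`, `reduced.h5`), the dataset names, and the helper `copy_without` are made up for the demo and are not part of h5py; like the code above, it only looks at top-level objects and root attributes.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical demo files in a temporary directory.
tmp = tempfile.mkdtemp()
src = os.path.join(tmp, "full.h5")
dst = os.path.join(tmp, "reduced.h5")

# Build a source file with one small and one large dataset.
with h5py.File(src, "w") as f:
    f.attrs["note"] = "demo"
    f.create_dataset("small", data=np.arange(10))
    f.create_dataset("big_unwanted", data=np.zeros(1_000_000))


def copy_without(src_path, dst_path, unwanted):
    """Copy root attributes and every top-level object whose name
    does not contain `unwanted` (hypothetical helper)."""
    with h5py.File(src_path, "r") as fs, h5py.File(dst_path, "w") as fd:
        for key, value in fs.attrs.items():
            fd.attrs[key] = value
        for name in fs:
            if unwanted not in name:
                fs.copy(name, fd)


copy_without(src, dst, "big_unwanted")

size_full = os.path.getsize(src)
size_reduced = os.path.getsize(dst)
with h5py.File(dst, "r") as f:
    copied = sorted(f.keys())
    note = f.attrs["note"]

print(size_full, size_reduced, copied)
```

The new file never contained the large dataset, so it comes out smaller than the source.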
Upvotes: 7
Reputation: 982
Yes, this can be done.
with h5py.File(filename, "a") as f:
    del f[datasetname]
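You will need to have the file open in a writeable mode that preserves the existing content, for example append ("a", as above) or read-write ("r+"). Note that opening with "w" truncates the file, which would discard all datasets.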
As noted by @seppo-enarvi in the comments, the purpose of the previously recommended f.__delitem__(datasetname) function is to implement the del operator, so that one can delete a dataset using del f[datasetname].
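As a self-contained illustration of the deletion, here is a minimal sketch. The file name `example.h5` and the dataset names `keep_me` and `drop_me` are made up for the demo, and the file lives in a temporary directory so nothing real is touched.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file and dataset names, used only for this demo.
path = os.path.join(tempfile.mkdtemp(), "example.h5")

# Create a small file with two datasets.
with h5py.File(path, "w") as f:
    f.create_dataset("keep_me", data=np.arange(10))
    f.create_dataset("drop_me", data=np.zeros(5))

# Reopen in a mode that keeps existing content ("a" or "r+")
# and delete one dataset.
with h5py.File(path, "a") as f:
    if "drop_me" in f:  # avoids a KeyError if the name is missing
        del f["drop_me"]

# The other dataset is still there and unchanged.
with h5py.File(path, "r") as f:
    remaining = sorted(f.keys())
    kept = f["keep_me"][...]

print(remaining)
```

This removes the link to the dataset; as the other answers point out, the file on disk does not necessarily get smaller as a result.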
Upvotes: 59
Reputation: 705
I do not understand what your question has to do with the file open modes. For read/write, r+ is the way to go.
To my knowledge, removing a dataset is not easy, if it is possible at all; in particular, no matter what you do, the file size will not shrink.
But overwriting the content is no problem:
f['mydataset'][:] = 0
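The in-place assignment works as long as the dataset keeps its shape and dtype. If the replacement data has a different shape, one possible approach (a sketch, not the only way) is to delete the dataset and create a new one under the same name. The file name and the second dataset `other` are made up for the demo.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical demo file in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "example.h5")

with h5py.File(path, "w") as f:
    f.create_dataset("mydataset", data=np.ones(4))
    f.create_dataset("other", data=np.arange(3))

with h5py.File(path, "r+") as f:
    # Same shape: overwrite the values in place.
    f["mydataset"][...] = 0
    zeroed = f["mydataset"][...]

    # Different shape: remove the old dataset, then create a new
    # one with the same name.
    del f["mydataset"]
    f.create_dataset("mydataset", data=np.arange(6).reshape(2, 3))

# The replaced dataset has the new shape; the other one is intact.
with h5py.File(path, "r") as f:
    new_shape = f["mydataset"].shape
    other = f["other"][...]

print(new_shape)
```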
Upvotes: 0