Reputation: 2385
I'm trying to use Torch-hdf5 to save some tensors to hdf5!
I want to follow this document very carefully: https://github.com/deepmind/torch-hdf5/blob/master/doc/usage.md
However, in the section on writing to HDF5, its example is:
require 'hdf5'
local myFile = hdf5.open('/path/to/write.h5', 'w')
myFile:write('/path/to/data', torch.rand(5, 5))
myFile:close()
I understand that "/path/to/write.h5" refers to the final file, but what is "/path/to/data"? Is it just an arbitrary separate path? I put down "data/" and got this horrific-looking error:
HDF5-DIAG: Error detected in HDF5 (1.8.13) thread 0:
#000: H5G.c line 287 in H5Gcreate2(): no name
major: Invalid arguments to routine
minor: Bad value
HDF5-DIAG: Error detected in HDF5 (1.8.13) thread 0:
#000: H5I.c line 2245 in H5Iget_name(): can't retrieve object location
major: Object atom
minor: Can't get value
#001: H5Gloc.c line 253 in H5G_loc(): invalid object ID
major: Invalid arguments to routine
minor: Bad value
Does HDF5 store the data and an instruction file separately? Is that why we pass in two paths?
Upvotes: 4
Views: 3413
Reputation: 4364
I am an HDF5 developer, not a Torch developer, so I don't know exactly how Torch works, but I can point out that HDF5 allows users to create hierarchical 'groups' inside an HDF5 file (hence the H in HDF5). These are represented in the same way as file paths on POSIX systems. In /path/to/data, 'path' and 'to' would be HDF5 groups, and 'data' would be either an HDF5 dataset or possibly an HDF5 group in which Torch would store one or more datasets with standard names (a quick perusal of the Torch code makes it look like the former).
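To make the group/dataset split concrete, here is a minimal sketch in plain Python. It is not torch-hdf5's or HDF5's actual parsing code; `split_hdf5_path` is a hypothetical helper written only to illustrate how a POSIX-style path breaks into group names plus a final object name. It also shows that a path with a trailing slash, such as "data/", leaves an empty final name, which would be consistent with the "no name" message in the question (that link is an assumption, not something confirmed from the library source).

```python
def split_hdf5_path(path):
    """Split an HDF5-style path into (group names, final object name).

    Hypothetical illustration only; real HDF5 libraries do their own parsing.
    """
    parts = path.split("/")
    # A leading "/" produces an empty first component; drop it.
    if parts and parts[0] == "":
        parts = parts[1:]
    if not parts:
        return [], ""
    # Everything before the last component is a group; the last is the object.
    return parts[:-1], parts[-1]


groups, name = split_hdf5_path("/path/to/data")
print(groups, name)

# A trailing slash leaves the final component empty.
groups_bad, name_bad = split_hdf5_path("data/")
print(groups_bad, repr(name_bad))
```

Under this reading, passing something like '/data' or 'data' (no trailing slash) gives the final object a non-empty name.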
Upvotes: 0
Reputation: 1477
The first path is the path to the actual file on disk. This is where everything is stored.
local myFile = hdf5.open('/path/to/write.h5', 'w')
The second path, i.e. the data path, is the path of key names within the file that leads to the tensor. HDF5 stores data hierarchically, much like a dictionary of dictionaries, so /path/to/data represents a top-level key named "path", which leads to a key named "to", which leads to the final key, "data", which holds the tensor. It can be accessed as hdf5Data["path"]["to"]["data"] when the HDF5 file is loaded.
myFile:write('/path/to/data', torch.rand(5, 5))
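The dictionary-of-dictionaries picture can be sketched in plain Python. This is only a mental model, not the torch-hdf5 API: `write` here is a hypothetical function operating on an ordinary dict, and a nested list stands in for the tensor.

```python
def write(store, path, value):
    """Store value in a nested dict, creating intermediate dicts as needed.

    Hypothetical stand-in for writing a dataset at an HDF5-style path.
    """
    keys = [k for k in path.split("/") if k]  # ignore empty components
    if not keys:
        raise ValueError("path must name at least one key")
    node = store
    for key in keys[:-1]:
        # Each intermediate key plays the role of an HDF5 group.
        node = node.setdefault(key, {})
    # The final key plays the role of the dataset holding the tensor.
    node[keys[-1]] = value


hdf5_data = {}
write(hdf5_data, "/path/to/data", [[0.1, 0.2], [0.3, 0.4]])
print(hdf5_data["path"]["to"]["data"])
```

The file path given to hdf5.open and the data path given to write are therefore unrelated: one locates the file on disk, the other locates the tensor inside that single file.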
Hope this helps.
Upvotes: 0