windweller

Reputation: 2385

Use Torch-hdf5 to save Tensor to Hdf5

I'm trying to use Torch-hdf5 to save some tensors to hdf5!

I want to follow this document very carefully: https://github.com/deepmind/torch-hdf5/blob/master/doc/usage.md

However, in the section on writing to HDF5, its example is:

require 'hdf5'
local myFile = hdf5.open('/path/to/write.h5', 'w')
myFile:write('/path/to/data', torch.rand(5, 5))
myFile:close()

I understand that "/path/to/write.h5" refers to the final file, but what is "/path/to/data"? Is it just an arbitrary, separate path? I assumed so and passed "data/". Then I got this horrific-looking error:

HDF5-DIAG: Error detected in HDF5 (1.8.13) thread 0:
  #000: H5G.c line 287 in H5Gcreate2(): no name
    major: Invalid arguments to routine
    minor: Bad value
HDF5-DIAG: Error detected in HDF5 (1.8.13) thread 0:
  #000: H5I.c line 2245 in H5Iget_name(): can't retrieve object location
    major: Object atom
    minor: Can't get value
  #001: H5Gloc.c line 253 in H5G_loc(): invalid object ID
    major: Invalid arguments to routine
    minor: Bad value

Does HDF5 store the data and an instruction file separately? Is that why we pass in two paths?

Upvotes: 4

Views: 3413

Answers (2)

Dana Robinson

Reputation: 4364

I am an HDF5 developer, not a Torch developer, so I don't know exactly how Torch works, but I can point out that HDF5 lets users create hierarchical 'groups' inside an HDF5 file (hence the H in HDF5). Group paths are written the same way as file paths on POSIX systems. In /path/to/data, 'path' and 'to' would be HDF5 groups, and 'data' would be either an HDF5 dataset or possibly an HDF5 group in which Torch would store one or more datasets with standard names (a quick look at Torch suggests the former).
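To make the group/dataset split concrete, here is a minimal plain-Lua sketch of how such a path breaks down. The `splitPath` helper is hypothetical and written only for illustration; it is not part of torch-hdf5 or HDF5, and it does not claim to reproduce how either library parses paths.

```lua
-- Hypothetical helper for illustration only; not part of torch-hdf5 or HDF5.
-- Splits an HDF5-style object path into its group names and final object name.
local function splitPath(path)
    local parts = {}
    -- Drop a single leading '/', then collect every '/'-separated component,
    -- including empty ones (so a trailing '/' yields an empty final name).
    local trimmed = path:gsub('^/', '')
    for part in (trimmed .. '/'):gmatch('(.-)/') do
        parts[#parts + 1] = part
    end
    -- The last component names the dataset; everything before it is a group.
    local name = table.remove(parts)
    return parts, name
end

-- '/path/to/data': groups are 'path' and 'to', the dataset name is 'data'.
local groups, name = splitPath('/path/to/data')
print(table.concat(groups, ', '), name)

-- 'data/': the only group is 'data', and the final name is an empty string.
local groups2, name2 = splitPath('data/')
print(table.concat(groups2, ', '), '[' .. name2 .. ']')
```

Under this reading, a path with a trailing slash such as "data/" ends in an empty name, which is one plausible explanation for the "no name" message in the error above. A path such as '/data' or 'data' would avoid that.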

Upvotes: 0

greenberet123

Reputation: 1477

The first path is the path to the actual file on disk. This is where everything is stored.

local myFile = hdf5.open('/path/to/write.h5', 'w')

The second path, i.e. the data path, is the path of key names within the file that leads to the tensor. HDF5 can be thought of as storing data as a dictionary of dictionaries, so /path/to/data represents a top-level dictionary key named "path", which leads to a dictionary key named "to", which leads to the final key, "data", which in turn leads to the tensor. It can be accessed as hdf5Data["path"]["to"]["data"] when the HDF5 file is loaded.

myFile:write('/path/to/data', torch.rand(5, 5))
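A short round trip may make the two kinds of path easier to tell apart. This sketch assumes Torch and torch-hdf5 are installed and uses only the calls shown in the torch-hdf5 usage document (`hdf5.open`, `write`, `read(...):all()`, `close`). The file name `example.h5` is a placeholder chosen for illustration.

```lua
require 'torch'
require 'hdf5'

-- Hypothetical output location; replace with a writable path on your machine.
local fileName = 'example.h5'

-- Write a 5x5 tensor as the dataset 'data' inside the groups 'path' and 'to'.
local original = torch.rand(5, 5)
local outFile = hdf5.open(fileName, 'w')
outFile:write('/path/to/data', original)
outFile:close()

-- Reopen the same file read-only and load the whole dataset back as a tensor.
local inFile = hdf5.open(fileName, 'r')
local loaded = inFile:read('/path/to/data'):all()
inFile:close()

-- Print the dimensions of the tensor that was read back.
print(loaded:size())
```

The first argument to `hdf5.open` is a location on disk, while the string passed to `write` and `read` is a location inside that one file.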

Hope this helps.

Upvotes: 0
