user21816

Reputation: 137

Cannot write int16 data type using R's rhdf5 package

In R, I would like to write a matrix of integers to an HDF5 file (".h5") as an int16 data type, using the rhdf5 package. The documentation says to set one of the supported H5 data types when creating the dataset. However, even when I specify the int16 data type, the result is always int32. Is it possible to store the data as int16 or uint16?

library(rhdf5)

m <- matrix(1,5,5)
outFile <- "test.h5"
h5createFile(outFile)
h5createDataset(file=outFile,"m",dims=dim(m),H5type = "H5T_NATIVE_INT16")
h5write(m,file=outFile,name="m")
H5close()
h5ls(outFile)

The result is:

[Image: output of the code above; not available]

Upvotes: 0

Views: 129

Answers (2)

Grimbough

Reputation: 136

The code you provided works as expected, but it's a limitation of the h5ls() function in rhdf5 that it doesn't report a more detailed data type. As @r2evans points out, it's technically true that it's an integer; you just want a bit more detail than that.

If we run your code and use the h5ls command-line tool distributed by the HDF Group, we get more information:

library(rhdf5)

m <- matrix(1,5,5)
outFile <- tempfile(fileext = ".h5")
h5createFile(outFile)
h5createDataset(file=outFile,"m", dims=dim(m),H5type = "H5T_NATIVE_INT16")
h5write(m,file=outFile, name="m")

system2("h5ls", args = c("-v", outFile))

## Opened "/tmp/RtmpFclmR3/file299e79c4c206.h5" with sec2 driver.
## m                        Dataset {5/5, 5/5}
##     Attribute: rhdf5-NA.OK {1}
##         Type:      native int
##     Location:  1:800
##     Links:     1
##     Chunks:    {5, 5} 50 bytes
##     Storage:   50 logical bytes, 14 allocated bytes, 357.14% utilization
##     Filter-0:  shuffle-2 OPT {2}
##     Filter-1:  deflate-1 OPT {6}
##     Type:      native short

Here the most important part is the final line, which confirms the datatype is "native short", a.k.a. native int16.
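
If the h5ls command-line tool is not installed, the same check can probably be done from within R using rhdf5's low-level functions. The following is only a sketch: it assumes a reasonably recent rhdf5 version that exports H5Tget_size(), and the variable names (fid, did, tid, type_size) are purely illustrative.

```r
library(rhdf5)

# Write a small integer matrix as int16, as in the question.
m <- matrix(1L, 5, 5)
outFile <- tempfile(fileext = ".h5")
h5createFile(outFile)
h5createDataset(file = outFile, dataset = "m", dims = dim(m),
                H5type = "H5T_NATIVE_INT16")
h5write(m, file = outFile, name = "m")

# Open the file and dataset with the low-level API.
fid <- H5Fopen(outFile)
did <- H5Dopen(fid, "m")

# Query the datatype stored in the file and its size in bytes.
# Assumption: H5Tget_size() is available in the installed rhdf5 version.
tid <- H5Dget_type(did)
type_size <- H5Tget_size(tid)

H5Dclose(did)
H5Fclose(fid)
H5close()

print(type_size)
```

A size of 2 bytes corresponds to a 16-bit type, whereas a 32-bit integer would report 4.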

Upvotes: 0

Billy34

Reputation: 2174

I used another library (hdf5r), as I did not find rhdf5:

library(hdf5r)

m <- matrix(1L,5L,5L)
outFile <- h5file("test.h5")
createDataSet(outFile, "m", m, dtype=h5types$H5T_NATIVE_INT16)

print(outFile)

print(outFile[["m"]])

h5close(outFile)

For the first print (the file):

Class: H5File
Filename: D:\Travail\Projets\SWM\swm.gps\test.h5
Access type: H5F_ACC_RDWR
Listing:
 name    obj_type dataset.dims dataset.type_class
    m H5I_DATASET        5 x 5        H5T_INTEGER

Here we see that it displays H5T_INTEGER as the datatype for the dataset m.

And for the second print (the dataset):

Class: H5D
Dataset: /m
Filename: D:\Travail\Projets\SWM\swm.gps\test.h5
Access type: H5F_ACC_RDWR
Datatype: H5T_STD_I16LE
Space: Type=Simple     Dims=5 x 5     Maxdims=Inf x Inf
Chunk: 64 x 64

We can see that it has the right datatype, H5T_STD_I16LE.
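
Rather than reading the datatype off the printed summary, it can also be queried programmatically. This is a minimal sketch using hdf5r's R6 interface instead of the wrapper functions above; the file path and variable names are illustrative, and it assumes the get_type(), get_size() and to_text() methods behave as described in the hdf5r documentation.

```r
library(hdf5r)

# Create a fresh file and write an integer matrix as int16.
path <- tempfile(fileext = ".h5")
f <- H5File$new(path, mode = "w")
f$create_dataset("m", robj = matrix(1L, 5L, 5L),
                 dtype = h5types$H5T_NATIVE_INT16)

# Ask the dataset for its stored datatype.
dtype <- f[["m"]]$get_type()
type_size <- dtype$get_size()   # size in bytes
type_text <- dtype$to_text()    # textual description of the type

f$close_all()

print(type_size)
print(type_text)
```

A size of 2 bytes is what a 16-bit integer type should report.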

Upvotes: 1
