Reputation: 6713
I have an HDF5
file written using the rhdf5
package. The output of h5ls(myHDF5, all=TRUE)
is as follows:
  group      name       otype  dclass       dim
0     /     char5 H5I_DATASET  STRING   1867124
1     /     char6 H5I_DATASET  STRING   1867124
2     /     char7 H5I_DATASET  STRING   1867124
3     /      dims H5I_DATASET INTEGER         2
4     /   headers H5I_DATASET  STRING       212
5     /       int H5I_DATASET INTEGER 233390500
6     /  intorder H5I_DATASET INTEGER       125
7     /      real H5I_DATASET   FLOAT 156838416
8     / realorder H5I_DATASET INTEGER        84
If I read the headers object (a string vector) from the myHDF5 file as follows:
headers <- h5read(myHDF5, "headers")
it works fine.
But if I try to read a larger string vector as follows:
char5 <- h5read(myHDF5, "char5")
then R crashes (RStudio reloads).
The larger string array char5
had been previously stored as follows:
nr <- length(char5)
mxsize <- max(nchar(char5))
h5createDataset(myHDF5, "char5", storage.mode = "character", level = 9, dims = nr, chunk = 10000, size = mxsize)
h5write(char5, myHDF5, "char5")
while the smaller string array headers
had been previously stored as follows:
nc <- length(headers)
mxsize <- max(nchar(headers))
h5createDataset(myHDF5, "headers", storage.mode = "character", level = 9, dims = nc, chunk = nc, size = mxsize)
h5write(headers, myHDF5, "headers")
The main difference is the chunk
size used. I changed the chunk
size for the larger string vector to be the same as dims
, i.e. chunk = nr
, and R still crashes.
What could be the reason for R to crash?
Note: R doesn't crash if I read the integer or float data from the myHDF5
file.
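A possible way to narrow this down would be to read char5 in smaller slices via the index argument of h5read. This is only a sketch; the slice size of 100000 is arbitrary:
library(rhdf5)
n <- 1867124                      # length of char5 as reported by h5ls
slice <- 100000                   # arbitrary slice size
char5 <- character(n)
for (i in seq(1, n, by = slice)) {
  idx <- i:min(i + slice - 1, n)  # indices of the current slice
  char5[idx] <- h5read(myHDF5, "char5", index = list(idx))
}
H5close()                         # close any open HDF5 handles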
Upvotes: 0
Views: 442
Reputation: 11
I had the same problem. A simple solution, although not perfect, is to use the "h5r" package:
library(h5r)
f <- H5File(h5FilePath)                 # open the HDF5 file
g <- getH5Group(f, "/")                 # get the root group
d <- getH5Dataset(g, "stringArray")[]   # [] reads the whole string dataset into memory
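Applied to the file from the question, the same pattern would look roughly like this (assuming myHDF5 holds the file path and the dataset is named char5; untested):
library(h5r)
f <- H5File(myHDF5)                    # open the file from the question
g <- getH5Group(f, "/")                # root group
char5 <- getH5Dataset(g, "char5")[]    # read the large string dataset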
Upvotes: 1