Reputation: 2764
I have a large .rds file saved, and I am trying to import it directly into an H2O frame, because it is not feasible for me to read the file into the R environment and then convert it with the as.h2o function.
I am looking for a fast and efficient way to deal with it.
My attempts: h2o.import() with parse=T.
Due to memory constraints, I was not able to save the complete data frame. Please suggest an efficient way to do it.
Any suggestions would be highly appreciated.
Upvotes: 2
Views: 514
Reputation: 8819
The native read/write functionality in R is not very efficient, so I'd recommend using data.table for that. Both options below make use of data.table in some way.
First, I'd recommend trying the following: once you have installed the data.table package and loaded the h2o library, set options("h2o.use.data.table"=TRUE). That makes sure as.h2o() uses data.table underneath for the conversion from an R data.frame to an H2O Frame. Something to note about how as.h2o() works: it writes the data frame from R to disk and then reads it back into H2O using h2o.importFile(), H2O's parallel file reader.
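A minimal sketch of this first option, assuming the .rds file can still be loaded into R at all. The function name and the "data.rds" path in the usage comments are hypothetical, and an H2O cluster is assumed to be running already (for example via h2o.init()):

```r
# Sketch only: the function name and the example path are illustrative.
# Calls are namespaced (h2o::), so library(h2o) is not needed to define this.
rds_to_h2o_via_as_h2o <- function(rds_path) {
  # Ask as.h2o() to use data.table for the R -> disk step, if it is installed.
  options("h2o.use.data.table" = TRUE)

  # Note: this still loads the whole object into R's memory.
  df <- readRDS(rds_path)

  # as.h2o() writes df to a temporary file and has H2O parse it.
  h2o::as.h2o(df)
}

# Usage (requires a running H2O cluster):
# h2o::h2o.init()
# hf <- rds_to_h2o_via_as_h2o("data.rds")
```

While as.h2o() runs, the data exists both in R and in H2O, which is the memory cost that the second option avoids.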
There is another option, which is effectively the same thing, though your RAM doesn't need to store two copies of the data at once (one in R and one in H2O), so it might be more efficient if you are really strapped for resources.
Save the file as a CSV or a zipped CSV. If you are having issues saving the data frame to disk as a CSV, make sure you're using an efficient file writer like data.table::fwrite(). Once you have the file on disk, read it directly into H2O using h2o.importFile().
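The second option could look roughly like the sketch below. The helper names and file paths are hypothetical. It assumes the .rds file holds a data frame that fits in R's memory long enough to be written out once, and that the CSV path is readable by the H2O cluster:

```r
# Sketch only: helper names and example paths are illustrative.

# Step 1: convert the .rds file to a CSV on disk using data.table's fast writer.
rds_to_csv <- function(rds_path, csv_path) {
  df <- readRDS(rds_path)
  data.table::fwrite(df, csv_path)
  # df goes out of scope on return, so R can free it before H2O loads the data.
  invisible(normalizePath(csv_path))
}

# Step 2: let H2O's parallel reader parse the CSV directly.
import_csv_to_h2o <- function(csv_path) {
  h2o::h2o.importFile(path = csv_path)
}

# Usage (requires a running H2O cluster):
# csv <- rds_to_csv("data.rds", "data.csv")
# gc()                      # release the R copy before importing
# h2o::h2o.init()
# hf <- import_csv_to_h2o(csv)
```

Running the two steps separately, even in different R sessions, means R and H2O never have to hold the data at the same time.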
Upvotes: 5