Reputation: 2617
I am trying to row-bind two very large datasets, but I don't have enough RAM.
What I tried is:
A <- fread("A.csv"); B <- fread("B.csv")
gc()  # to free the memory
AB <- rbindlist(list(A, B), fill = TRUE)
Then I get an error saying that I don't have enough RAM:
Error: cannot allocate vector of size 1.6 Mb
Note that the two datasets have different column names, which is why I need to use fill = TRUE. Also, the datasets contain both numeric and character values.
How can I merge both without running into RAM issues?
Edit
A is 357896 x 11873.
B is 64979 x 877.
The column names of A are different from those of dataset B.
Upvotes: 0
Views: 244
Reputation: 16981
If the combined dataset can fit into memory, you could try combining the tables in a CSV file via fwrite with append = TRUE.
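library(data.table)

# Write the header (the union of both tables' columns) plus all rows of B.
# One row of A is enough to bring in A's column names; [-1] drops it again.
fwrite(
  rbindlist(
    list(
      fread("A.csv", nrows = 1L),
      fread("B.csv")
    ),
    fill = TRUE
  )[-1],
  "AB.csv"
)

# Append the rows of A. A's columns come first in the header, so the
# shorter rows line up and the trailing B-only columns are left empty.
fwrite(
  fread("A.csv"),
  "AB.csv",
  append = TRUE
)

# maybe restart R here
AB <- fread("AB.csv", fill = TRUE)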
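Before running this on the full files, it may be worth trying the same append approach on a tiny example to see how the columns line up. The sketch below uses two made-up tables written to a temporary directory; the paths, column names (id, x, y) and values are assumptions for illustration only, not the real data.

```r
library(data.table)

# Hypothetical toy files standing in for A.csv and B.csv
demo_dir <- tempfile("rbind_demo_")
dir.create(demo_dir)
a_path  <- file.path(demo_dir, "A.csv")
b_path  <- file.path(demo_dir, "B.csv")
ab_path <- file.path(demo_dir, "AB.csv")

fwrite(data.table(id = 1:3, x = c("a", "b", "c")), a_path)  # toy "A"
fwrite(data.table(id = 4:5, y = c(0.5, 1.5)), b_path)       # toy "B"

# Step 1: header with the union of columns, followed by the rows of B
fwrite(
  rbindlist(list(fread(a_path, nrows = 1L), fread(b_path)), fill = TRUE)[-1],
  ab_path
)

# Step 2: append the rows of A without a header
fwrite(fread(a_path), ab_path, append = TRUE)

# Step 3: read the combined file; fill = TRUE pads the shorter A rows
AB <- fread(ab_path, fill = TRUE)
print(AB)
```

Note that the rows of B come first in the result, followed by the rows of A. Missing values in character columns may come back as empty strings rather than NA after the CSV round trip, depending on the na.strings setting of fread, so that is worth checking on your own data.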
Upvotes: 1