Zach

Reputation: 30301

Most efficient list to data.frame method, when the list is a list of rows

This question covers the case where I have a list of columns, and I wish to turn them into a data.frame. What if I have a list of rows, and I wish to turn them into a data.frame?

vectorList <- lapply(1:500000, function(x) sample(0:1, 300, replace = TRUE))

The naive way to solve this is using rbind and as.data.frame, but we can't even get past the rbind step:

> Data <- do.call(rbind, vectorList)
Error: cannot allocate vector of size 572.2 Mb

What is a more efficient way to do this?

Upvotes: 3

Views: 166

Answers (2)

mdsumner

Reputation: 29477

Try direct coercion to a matrix. R arrays are column-major, so byrow = TRUE is needed to lay each list element out as a row:

Data <- matrix(unlist(vectorList), ncol = length(vectorList[[1]]), byrow = TRUE)
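As a minimal sketch of the call above, the following uses a tiny stand-in list (5 rows of 4 values, an assumption made only to keep the example small) and checks that the result matches the rbind approach. The final as.data.frame step is not part of the answer; it is included only because the question ultimately asks for a data.frame.

```r
# Small stand-in for the question's list: 5 rows of 4 values each
set.seed(1)
vectorList <- lapply(1:5, function(i) sample(0:1, 4, replace = TRUE))

# unlist() concatenates the rows end to end;
# byrow = TRUE lays them back out as rows of the matrix
Data <- matrix(unlist(vectorList), ncol = length(vectorList[[1]]), byrow = TRUE)

# Reference result from the rbind approach, feasible at this size
viaRbind <- do.call(rbind, vectorList)
stopifnot(all(Data == viaRbind))

# Convert to a data.frame once the matrix exists (this makes another copy)
df <- as.data.frame(Data)
```

At the question's scale the same call applies unchanged, but every step here allocates a full copy of the data.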

If that also does not work, you do not have the resources to copy the data, so consider creating the matrix first and populating it column by column.
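One possible reading of that suggestion, sketched on a small stand-in list (the sizes and the use of vapply are assumptions, not something the answer spells out): allocate the full matrix once, then fill column j with the j-th element of every row.

```r
# Small stand-in for the question's list: 5 rows of 4 values each
set.seed(1)
vectorList <- lapply(1:5, function(i) sample(0:1, 4, replace = TRUE))

nr <- length(vectorList)
nc <- length(vectorList[[1]])

# Allocate the full result once, as integers
Data <- matrix(NA_integer_, nrow = nr, ncol = nc)

# Fill one column at a time: column j collects the j-th element of every row
for (j in seq_len(nc)) {
  Data[, j] <- vapply(vectorList, function(row) row[j], integer(1))
}
```

This avoids the single large intermediate vector that unlist creates, at the cost of one pass over the list per column.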

Upvotes: 1

Joshua Ulrich

Reputation: 176638

It would probably be fastest / most efficient to unlist your list and fill a matrix:

> m <- matrix(unlist(vectorList), ncol=300, nrow=length(vectorList), byrow=TRUE)

But the result alone needs ~0.6GB of RAM with integer vectors and ~1.2GB with numeric vectors, and unlist makes a temporary copy of the same size along the way.

> l <- integer(5e5*300)
> print(object.size(l),units="Gb")
0.6 Gb
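The size figure above is just element count times bytes per element. A rough sketch of that arithmetic follows; estimate_bytes is a hypothetical helper introduced here for illustration, and it is checked against object.size on a small case (1000 rows of 300 values) so that it runs on any machine.

```r
# Hypothetical helper: payload bytes for an nrows-by-ncols matrix,
# ignoring R's small fixed per-object header
estimate_bytes <- function(nrows, ncols, bytes_per_element) {
  nrows * ncols * bytes_per_element
}

# Small test case: 1000 rows of 300 values
int_est <- estimate_bytes(1000, 300, 4)  # integers take 4 bytes each
dbl_est <- estimate_bytes(1000, 300, 8)  # doubles (numeric) take 8 bytes each

# Compare with what R actually reports for vectors of that length
int_actual <- as.numeric(object.size(integer(1000 * 300)))
dbl_actual <- as.numeric(object.size(numeric(1000 * 300)))

# Difference between reported size and estimate (the object header)
int_actual - int_est
dbl_actual - dbl_est
```

Dividing the result by 1024^3 gives a figure comparable to the Gb value printed by object.size for the real row count.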

Upvotes: 5
