user2657817

Reputation: 652

Distribute a vector over a column in list of lists in R

I have a vector with over 8 million elements that looks as follows:

[1]  49.99988  50.19328  50.19342  50.19348

I also have a list of lists of length 1421, in which each nested element is a table whose third column is all zeros. I would like to distribute the vector over that third column, so that the 8 million elements are spread, in order, across the third columns of all the nested tables. Summing the number of rows of every nested table should therefore give the length of the initial vector (over 8 million).

Below is what the list of lists looks like; I want to populate column "c" in each nested table with the values from the vector:

[[1]]
        a       b       c
 1:   49        0        0
 2:   50.0      0.31     0
 3:   50.1      0.018    0 
 ...

The approach I currently have in mind is to store the number of rows of each element in the list of lists and use those counts to determine how many elements of the vector should go into the corresponding sub-list. Is there a faster way to do this?
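For reference, the row-count approach described above might look roughly like the sketch below. The names tables and values and the toy data are hypothetical stand-ins (the real objects have 1421 tables and over 8 million values), and plain data frames are assumed:

```r
# Hypothetical toy stand-ins for the real data
tables <- list(
  data.frame(a = c(49, 50.0), b = c(0, 0.31), c = 0),
  data.frame(a = c(50.1, 50.2, 50.3), b = c(0.018, 0, 0), c = 0)
)
values <- c(49.99988, 50.19328, 50.19342, 50.19348, 50.2)

# Number of rows in each table
n_rows <- vapply(tables, nrow, integer(1))

# The values only fit if the row counts add up to the vector length
stopifnot(sum(n_rows) == length(values))

# First position of each table's chunk inside `values`
starts <- cumsum(n_rows) - n_rows + 1L

# Copy each consecutive chunk into column "c" of its table
for (i in seq_along(tables)) {
  idx <- seq.int(starts[i], length.out = n_rows[i])
  tables[[i]]$c <- values[idx]
}
```

Using seq.int() with length.out keeps the index empty for a table with zero rows, which a plain start:end range would not.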

Upvotes: 1

Views: 183

Answers (1)

thelatemail

Reputation: 93833

If you have something like:

lofl <- rep(list(data.frame(a=1:3,b=2:4,c=0)), 2)
lofl
#[[1]]
#  a b c
#1 1 2 0
#2 2 3 0
#3 3 4 0
#
#[[2]]
#  a b c
#1 1 2 0
#2 2 3 0
#3 3 4 0
vec <- 1:6

Then you can do:

Map(
  replace,
  lofl,
  "c",
  split(vec, rep(seq_along(lofl), sapply(lofl,nrow))) 
)
#[[1]]
#  a b c
#1 1 2 1
#2 2 3 2
#3 3 4 3
#
#[[2]]
#  a b c
#1 1 2 4
#2 2 3 5
#3 3 4 6

This splits vec into consecutive chunks, one per element of the list of lists (lofl here), each as long as that element has rows. Map then loops over lofl and the chunks in parallel and replaces the "c" column with the matching chunk.
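To make the intermediate steps visible, the same call can be unrolled as in the sketch below. The names n_rows, grp, chunks and result are introduced here only for illustration, and the length check is an added assumption about the data (the row counts must add up to length(vec)), not part of the one-liner above:

```r
# Same toy data as in the example above
lofl <- rep(list(data.frame(a = 1:3, b = 2:4, c = 0)), 2)
vec <- 1:6

# Number of rows in each data frame
n_rows <- vapply(lofl, nrow, integer(1))

# The chunks can only line up if the row counts add up to length(vec)
stopifnot(sum(n_rows) == length(vec))

# Group index: element i of vec belongs to data frame grp[i]
grp <- rep(seq_along(lofl), n_rows)

# One chunk of vec per data frame
chunks <- split(vec, grp)

# Replace column "c" of each data frame with its chunk
result <- Map(replace, lofl, "c", chunks)
```

One thing to watch with this version: split() silently drops groups that have no elements, so a data frame with zero rows would leave chunks shorter than lofl. If that can happen in your data, it is worth checking length(chunks) == length(lofl) before the Map call.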

If each element of the list is itself a list of several data frames and you only want to spread the values over the first data frame in each, you can use similar logic after some subsetting:

lofl <- rep(list(rep(list(data.frame(a=1:3,b=2:4,c=0)), 2)), 2)
Map(
  function(L,cn,val) {L[[1]][cn] <- val; L},
  lofl,
  "c",
  split(vec, rep(seq_along(lofl), sapply(lofl, function(x) nrow(x[[1]]))))
)
#[[1]]
#[[1]][[1]]
#  a b c
#1 1 2 1
#2 2 3 2
#3 3 4 3
#
#[[1]][[2]]
#  a b c
#1 1 2 0
#2 2 3 0
#3 3 4 0
#
#
#[[2]]
#[[2]][[1]]
#  a b c
#1 1 2 4
#2 2 3 5
#3 3 4 6
#
#[[2]][[2]]
#  a b c
#1 1 2 0
#2 2 3 0
#3 3 4 0

Upvotes: 2
