pajul

Reputation: 133

Apply function over nested list of dataframes in R

Apologies if this has been asked before - I've found a few answers related to applying functions over nested lists, but haven't managed to find one I can apply to my specific case.

I have a list containing two lists of dataframes:

set.seed(1)

df1 <- data.frame(x = rnorm(10), y = rnorm(10))
df2 <- data.frame(x = rnorm(10), y = rnorm(10))
df3 <- data.frame(x = rnorm(10), y = rnorm(10))

df4 <- data.frame(x = rnorm(20), y = rnorm(20))
df5 <- data.frame(x = rnorm(20), y = rnorm(20))
df6 <- data.frame(x = rnorm(20), y = rnorm(20))

lista <- list(df1, df2, df3)
listb <- list(df4, df5, df6)

list <- list(lista, listb)

I'd like to apply something like the following function over the two lists of dataframes:

f <- function (constant1, constant2, dfa, dfb){
  (constant1 * (sum(dfa$x) + sum(dfa$y))) + (constant2 * (sum(dfb$x) + sum(dfb$y))) 
}

So, for the list defined above, the function would use dfa = df1 and dfb = df4 in the first iteration. For the second iteration, these would become dfa = df2 and dfb = df5, and so on.

With both constants set as 1, the output should be a list containing three items:

> output
[[1]]
[1] 8.242232

[[2]]
[1] -2.19834

[[3]]
[1] 4.330664

I'm guessing I need mapply to do this, but can't work out how to pass the dataframes to it.

Among many other attempts, I tried the following, which throws the error "$ operator is invalid for atomic vectors":

output <- mapply(function(a, b, c, d) f(constant1 = a, constant2 = b, dfa = c, dfb = d),
                 a = 1, b = 1, c = list[[1]][[1]], d = list[[2]][[1]])

Upvotes: 1


Answers (2)

Roman

Reputation: 17648

A tidyverse solution:

library(tidyverse)

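foo <- function(x, constant1, constant2) {
  x %>%
    bind_rows(.id = "gr") %>%
    group_by(gr) %>%
    # sum(x) + sum(y) for each data frame
    summarise(res = sum(x, y)) %>%
    # pair the i-th data frame of the first list with the i-th of the second
    mutate(gr1 = rep(1:(n() / 2), n() / (n() / 2))) %>%
    group_by(gr1) %>%
    summarise(res = sum(res[1] * constant1, res[2] * constant2)) %>%
    pull(res)
}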

foo(list, constant1 = 1, constant2 = 1)
[1]  8.242232 -2.198340  4.330664

Upvotes: 0

Ronak Shah

Reputation: 388862

You can use mapply like this:

mapply(function(a, b) f(constant1 = 1, constant2 = 1, dfa = a, dfb = b),
       list[[1]], list[[2]])
#[1]  8.242232 -2.198340  4.330664
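
The difference from the attempt in the question is what mapply iterates over. There, c = list[[1]][[1]] is a single data frame, so mapply steps through its columns and f receives plain numeric vectors, which is why $ fails. Passing the whole inner list hands f one data frame per call. A minimal sketch on a toy data frame (the values and the names df, per_column and per_frame are made up for illustration):

```r
# Toy data frame (hypothetical values, not the question's data).
df <- data.frame(x = 1:2, y = 3:4)

# mapply over a single data frame visits its columns, one per call,
# so the function sees atomic vectors.
per_column <- mapply(function(d) class(d), df)

# mapply over a list of data frames visits whole data frames.
per_frame <- mapply(function(d) class(d), list(df, df))

print(per_column)
print(per_frame)
```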

Or, perhaps better:

mapply(f, list[[1]], list[[2]], MoreArgs = list(constant1 = 1, constant2 = 1))
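
Note that both calls simplify the result to a numeric vector, while the question asks for a list. If a list is really needed, one option is to pass SIMPLIFY = FALSE. A small sketch, assuming the f from the question and using toy data with easy-to-check sums (the name nested and the values are hypothetical; nested stands in for the question's list):

```r
# f as defined in the question.
f <- function(constant1, constant2, dfa, dfb) {
  (constant1 * (sum(dfa$x) + sum(dfa$y))) +
    (constant2 * (sum(dfb$x) + sum(dfb$y)))
}

# Toy nested list: two inner lists of two data frames each.
nested <- list(
  list(data.frame(x = 1:2, y = 3:4), data.frame(x = 5:6, y = 7:8)),
  list(data.frame(x = c(1, 1), y = c(1, 1)), data.frame(x = c(2, 2), y = c(2, 2)))
)

# SIMPLIFY = FALSE keeps one list element per pair of data frames.
output <- mapply(f, dfa = nested[[1]], dfb = nested[[2]],
                 MoreArgs = list(constant1 = 1, constant2 = 1),
                 SIMPLIFY = FALSE)

str(output)
```

Here the first pair gives (1 + 2 + 3 + 4) + (1 + 1 + 1 + 1) = 14 and the second gives (5 + 6 + 7 + 8) + (2 + 2 + 2 + 2) = 34, each as its own list element.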

Upvotes: 3
