Reputation: 1271
I am trying to apply a function over a list while also iterating over a second argument of that function, in R.
Here is an example:
Create the data
A <- data.frame(var = 1:3, year = 2000:2002)
B <- data.frame(var = 4:6, year = 2000:2002)
C <- data.frame(var = 7:9, year = 2000:2002)
ABC <- list(A, B, C)
> ABC
[[1]]
var year
1 1 2000
2 2 2001
3 3 2002
[[2]]
var year
1 4 2000
2 5 2001
3 6 2002
[[3]]
var year
1 7 2000
2 8 2001
3 9 2002
Write the function sum (which simply filters for a start year and sums the 'var' values; sorry, this simple function got messier in this example than I had intended).
library(dplyr)
sum <- function(dat, start.year) {
  dat %>%
    filter(year >= start.year) %>%
    select(var) %>%
    colSums() %>%
    data.frame(row.names = NULL) %>%
    rename(var = '.') %>%
    mutate(start = start.year)
}
Now I can apply the function to the list (and bind_rows to get a neat output):
lapply(ABC, sum, 2000) %>%
bind_rows()
var start
1 6 2000
2 15 2000
3 24 2000
What I want to do, however, is iterate over start.year, creating data frames for start.year = c(2000, 2001, 2002). This would ideally give:
var start
1 6 2000
2 15 2000
3 24 2000
4 5 2001
5 11 2001
6 17 2001
7 3 2002
8 6 2002
9 9 2002
I have looked at map2, but its documentation talks about using vectors of the same length. That would work in this case, but imagine my list had 4 items in it and only 3 records per list. So I assume map2 is doing something different. I also thought about a nested for loop. When I started writing that, however, I realized I would be dealing with list-append functions in R, and that seemed wrong. I assume this is an easy thing to do. Any help would be appreciated.
Upvotes: 2
Views: 1238
Reputation: 887118
We can do this with a nested lapply/map, looping over the start years on the outside and over the list on the inside:
library(purrr)
map_dfr(2000:2002, ~ map_dfr(ABC, sum, .x))
# var start
#1 6 2000
#2 15 2000
#3 24 2000
#4 5 2001
#5 11 2001
#6 17 2001
#7 3 2002
#8 6 2002
#9 9 2002
Or, inspired by @thelatemail's suggestion with Map, repeat the list and the years so that they pair up element by element:
map2_dfr(rep(ABC, 3), rep(2000:2002,each=length(ABC)), sum)
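The rep() calls above hard-code the number of years. As a minimal sketch of the same pairing idea for a list and a year vector of any lengths, expand.grid can build every (list index, start year) combination in base R. The helper name sum_from and the fourth data frame D are hypothetical, introduced only for this illustration; sum_from is a simplified base R stand-in for the OP's function, not a replacement for it.

```r
# Data from the question, plus a hypothetical fourth data frame D
A <- data.frame(var = 1:3, year = 2000:2002)
B <- data.frame(var = 4:6, year = 2000:2002)
C <- data.frame(var = 7:9, year = 2000:2002)
D <- data.frame(var = 10:12, year = 2000:2002)
ABCD <- list(A, B, C, D)
years <- 2000:2002

# Hypothetical base R helper: sum 'var' for rows at or after start.year
sum_from <- function(dat, start.year) {
  data.frame(var = base::sum(dat$var[dat$year >= start.year]),
             start = start.year)
}

# Every combination of list index and start year;
# the first column (i) varies fastest, so rows are grouped by year
grid <- expand.grid(i = seq_along(ABCD), start = years)

# Apply the helper to each (index, year) pair and stack the results
res <- do.call(rbind,
               Map(function(i, s) sum_from(ABCD[[i]], s), grid$i, grid$start))
res
```

Because the grid is built from seq_along(ABCD) and years, nothing needs to change when either one grows or shrinks.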
With lapply
do.call(rbind, lapply(2000:2002, function(x) do.call(rbind, lapply(ABC, sum, x))))
# var start
#1 6 2000
#2 15 2000
#3 24 2000
#4 5 2001
#5 11 2001
#6 17 2001
#7 3 2002
#8 6 2002
#9 9 2002
Or as @thelatemail mentioned
do.call(rbind, Map(sum, ABC, start.year=rep(2000:2002,each=length(ABC))))
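This works because Map (like mapply) recycles shorter arguments to the length of the longest one, so the three-element list is reused against all nine years. A small sketch makes the pairing visible; the letters are illustrative placeholders standing in for the data frames, not part of the original code.

```r
# Placeholder labels standing in for the three data frames
labels <- list("A", "B", "C")

# Each year repeated once per list element: 2000 2000 2000 2001 ...
years <- rep(2000:2002, each = length(labels))

# Map recycles 'labels' (length 3) to match 'years' (length 9)
pairs <- unlist(Map(function(d, y) paste(d, y), labels, years))
pairs
```

Every label is paired with 2000 first, then with 2001, then with 2002, which is the row order of the desired output.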
If the OP's function can be changed, another option is
library(dplyr)
library(tidyr)
map_dfr(ABC, ~ .x %>%
    crossing(year2 = 2000:2002) %>%
    filter(year >= year2) %>%
    group_by(year2) %>%
    summarise(var = base::sum(var)))
Or, instead of doing this in a list, we can bind the data frames together with bind_rows, then do a group-by sum after crossing with the input 'years':
bind_rows(ABC, .id = 'grp') %>%
  group_by(grp) %>%
  crossing(year2 = 2000:2002) %>%
  filter(year >= year2) %>%
  group_by(grp, year2) %>%
  summarise(var = base::sum(var))
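For comparison, the same cross-join-then-aggregate idea can be sketched in base R, assuming no tidyverse packages are available. Here merge with by = NULL produces the Cartesian product that crossing builds above; the object names all_rows, crossed, kept, and res are hypothetical.

```r
A <- data.frame(var = 1:3, year = 2000:2002)
B <- data.frame(var = 4:6, year = 2000:2002)
C <- data.frame(var = 7:9, year = 2000:2002)
ABC <- list(A, B, C)

# Stack the data frames, tagging each row with its list position
all_rows <- do.call(rbind, lapply(seq_along(ABC), function(i) {
  cbind(grp = i, ABC[[i]])
}))

# Cartesian product of every row with every start year
crossed <- merge(all_rows, data.frame(year2 = 2000:2002), by = NULL)

# Keep rows at or after each start year, then sum 'var' per group and year
kept <- crossed[crossed$year >= crossed$year2, ]
res <- aggregate(var ~ grp + year2, data = kept, FUN = base::sum)

# Order explicitly so rows are grouped by start year, as in the desired output
res <- res[order(res$year2, res$grp), ]
res
```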
Upvotes: 2