user3919790
user3919790

Reputation: 557

Cumulative sum across a nested list

I have a large tibble with one nested list column. Each element of the nested list column has 10,000 iterations and i would like to apply a cumulative sum across these iterations by a grouping variable.

I have created a minimal reproducible example below

tibble(a = list(c(1,2),c(3,4), c(5,6), c(7,8)),
c = c(1,1, 2, 2))

The intended output should be

tibble(a = list(c(1,2),c(4,6), c(5,6), c(12,14)),
c = c(1,1, 2, 2))

Tried the follwoing syntax but its clearly wrong

x <- tibble(a = list(c(1,2),c(4,6), c(5,6), c(7,8)),
c = c(1,1, 2, 2))

x %>% 
group_by(c) %>%
mutate(a = map(a,cumsum))

Any help greatly appreciated. I can potentially spread the data and add across the columns but that would be slow

Upvotes: 1

Views: 369

Answers (2)

tmfmnk
tmfmnk

Reputation: 40121

One base R option could be:

with(tbl, ave(a, c, FUN = function(x) Reduce(`+`, x, accumulate = TRUE)))

[[1]]
[1] 1 2

[[2]]
[1] 4 6

[[3]]
[1] 5 6

[[4]]
[1] 12 14

Upvotes: 2

lroha
lroha

Reputation: 34601

I think you're looking for the following though it doesn't match your desired output for the last two values (can you check that these are correct):

library(dplyr)
library(purrr)
library(tidyr)

df %>%
  group_by(c) %>%
  mutate(x = accumulate(a, `+`)) %>%
  unnest(cols = c(a, x))

# A tibble: 8 x 3
# Groups:   c [2]
      a     c     x
  <dbl> <dbl> <dbl>
1     1     1     1
2     2     1     2
3     3     1     4
4     4     1     6
5     5     2     5
6     6     2     6
7     7     2    12
8     8     2    14

Upvotes: 0

Related Questions