Ana

Reputation: 1586

How to create a running sum summary in R

I'd like to create a summary report from the data frame df in which each row is the cumulative sum of column A, based on column B (C is another id column). Below are the data and the desired output:

library(dplyr)  # provides %>% and arrange()

set.seed(154)
df <- data.frame(B = rep(c(1, 2, 3), each = 10),
                 C = rep(1:10, 3),
                 A = sample(0:10, 30, replace = TRUE)) %>% arrange(B, C)

output:

[image: desired output]

What I wrote was

df %>% arrange(B) %>% group_by(B) %>%
  transmute(test =sum(cumsum(A))) %>% unique()

But it just returns a single total for each value of B, not the cumulative sum.

Upvotes: 2

Views: 6541

Answers (2)

Adi Sarid

Reputation: 819

You need to use the function cumsum after a group_by(B), i.e.:

library(tidyverse)
df %>% 
   group_by(B) %>% 
   mutate(A_cum_sum = cumsum(A))

Note that arrange(B) is unnecessary, because group_by(B) already groups the rows by B whatever their order. From context I gather that only the ordering by C matters, and you already sorted by it when preparing df, so it does not need to be repeated.
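To make the behaviour concrete, here is a minimal sketch on a small made-up data frame (toy and its values are hypothetical, not the df from the question). It assumes dplyr is installed and shows that the running sum restarts at each new value of B:

```r
library(dplyr)

# Hypothetical toy data, chosen only for illustration
toy <- data.frame(B = c(1, 1, 1, 2, 2),
                  C = c(1, 2, 3, 1, 2),
                  A = c(2, 0, 5, 1, 4))

res <- toy %>%
  group_by(B) %>%                    # cumsum() is applied within each B
  mutate(A_cum_sum = cumsum(A)) %>%
  ungroup()

res$A_cum_sum
# 2 2 7 1 5  (restarts when B changes from 1 to 2)
```

Because mutate() keeps every row, the result has one row per (B, C) pair rather than one row per B.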

Upvotes: 0

akrun

Reputation: 887048

Maybe we need to get the sum of 'A' by 'B' and then take the cumulative sum of those totals:

library(dplyr)
df %>% 
  group_by(B) %>% 
  summarise(A = sum(A))  %>% 
  mutate(A = cumsum(A))
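As a quick check of what this produces, here is a small sketch on made-up data (toy and its values are hypothetical). It assumes dplyr is installed; summarise() collapses each B to one total, and cumsum() then runs across those totals:

```r
library(dplyr)

# Hypothetical toy data, chosen only for illustration
toy <- data.frame(B = c(1, 1, 2, 2, 3),
                  A = c(2, 3, 1, 4, 6))

out <- toy %>%
  group_by(B) %>%
  summarise(A = sum(A)) %>%   # per-group totals: 5, 5, 6
  mutate(A = cumsum(A))       # running total across groups

out$A
# 5 10 16
```

The result has one row per value of B, which matches a summary report where each row carries the running total up to and including that group.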

Upvotes: 2
