Reputation: 1586
I'd like to create a summary report from dataframe df where each row is the cumulative sum of column A based on column B (where C is another id column). Below is the data and output:
library(dplyr)
set.seed(154)
df <- data.frame(B = append(append(rep(1,10),rep(2,10)),rep(3,10)),
C = rep(1:10,3),
A = sample(0:10,30,replace=TRUE)) %>% arrange(B,C)
output:
What I wrote was
df %>% arrange(B) %>% group_by(B) %>%
transmute(test =sum(cumsum(A))) %>% unique()
But this just gives a single sum for each value of B, not the cumulative sum.
Upvotes: 2
Views: 6541
Reputation: 819
You need to use the function cumsum after a group_by(B), i.e.:
library(tidyverse)
df %>%
group_by(B) %>%
mutate(A_cum_sum = cumsum(A))
Note that the arrange(B) is irrelevant because your data is grouped by B. From context I deduce that only arrange(C) is important, and since you already applied it when preparing df, it is not needed again.
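To make the grouped behaviour concrete, here is a minimal sketch on a small hand-made data frame. The name toy and its values are assumptions for illustration only, not the df from the question. It shows that cumsum inside mutate restarts at each new value of B:

```r
library(dplyr)

# Hypothetical toy data (not the question's df): two groups of three rows
toy <- data.frame(B = c(1, 1, 1, 2, 2, 2),
                  C = c(1, 2, 3, 1, 2, 3),
                  A = c(2, 5, 1, 4, 0, 3))

running <- toy %>%
  group_by(B) %>%
  mutate(A_cum_sum = cumsum(A)) %>%  # running total restarts for each B
  ungroup()

running$A_cum_sum
# 2 7 8 4 4 7
```

This keeps one row per original row, so it fits if the report should show the running total within each B rather than a single figure per group.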
Upvotes: 0
Reputation: 887048
Maybe we need to get the sum of 'A' by 'B' and then take the cumulative sum of those totals:
library(dplyr)
df %>%
group_by(B) %>%
summarise(A = sum(A)) %>%
mutate(A = cumsum(A))
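As a quick check of what this produces, here is a small sketch on made-up data (toy and its values are assumptions, not the question's df). With a single grouping variable, summarise returns one ungrouped row per B, so the following cumsum runs across the groups:

```r
library(dplyr)

# Hypothetical toy data (not the question's df)
toy <- data.frame(B = c(1, 1, 1, 2, 2, 2),
                  A = c(2, 5, 1, 4, 0, 3))

totals <- toy %>%
  group_by(B) %>%
  summarise(A = sum(A)) %>%  # one row per B: totals 8 and 7
  mutate(A = cumsum(A))      # running total across groups

totals$A
# 8 15
```

If the expected output has one row per value of B, this is probably the shape wanted; if it has one row per original row, the grouped mutate approach is the closer match.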
Upvotes: 2