Reputation: 305
I have a list of members by signup month, what I'd like to do is create a data frame with total members by month.
Raw Data
month.list <- structure(c(18444, 18687, 18475, 18506, 18536, 18567, 18597,18718, 18659, 18628, 18779, 18748), class = "Date")
total.membership.working <- structure(list(`Mem Account` = c(26137295, 26139796, 26400007,26400455, 26402031, 26402078, 26402239, 1092287142, 1092295228,1092473120), Month = structure(c(18444, 18687, 18444, 18444,18475, 18475, 18444, 18779, 18779, 18779), class = "Date")), row.names = c(NA,-10L), groups = structure(list(`Mem Account` = c(26137295, 26139796,26400007, 26400455, 26402031, 26402078, 26402239, 1092287142,1092295228, 1092473120), .rows = structure(list(1L, 2L, 3L, 4L,5L, 6L, 7L, 8L, 9L, 10L), ptype = integer(0), class = c("vctrs_list_of","vctrs_vctr", "list"))), row.names = c(NA, -10L), class = c("tbl_df","tbl", "data.frame"), .drop = TRUE), class = c("grouped_df","tbl_df", "tbl", "data.frame"))
I have written a for loop that will accomplish this, but I'm hoping to find a Tidy way to do it without the loop.
For Loop
total.membership <- data.frame()
for(i in 1:length(month.list)) {
foo <- total.membership.working %>%
ungroup() %>%
filter(Month <= month.list[i]) %>%
summarise(Month = max(Month),
total_membership = n_distinct(`Mem Account`))
total.membership <- total.membership %>%
bind_rows(foo)
}
Desired Output
total.membership <- structure(list(Month = structure(c(18444, 18687, 18475, 18506,18536, 18567, 18597, 18628, 18779, 18748), class = "Date"), total_membership = c(45886L,58128L, 47878L, 49214L, 51119L, 53390L, 55200L, 56299L, 60503L,59583L)), row.names = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 10L, 11L,12L), class = "data.frame")
> total.membership
Month total_membership
1 2020-07-01 45886
2 2021-03-01 58128
3 2020-08-01 47878
4 2020-09-01 49214
5 2020-10-01 51119
6 2020-11-01 53390
7 2020-12-01 55200
8 2021-04-01 58902
9 2021-02-01 57238
10 2021-01-01 56299
11 2021-06-01 60503
12 2021-05-01 59583
Upvotes: 0
Views: 75
Reputation: 388982
Try this code to calculate cumulative unique accounts in each month.
library(dplyr)
total.membership.working %>%
ungroup %>%
arrange(Month) %>%
mutate(cum_n = cumsum(!duplicated(`Mem Account`))) %>%
group_by(Month) %>%
summarise(cum_unique_entries = max(cum_n))
# Month cum_unique_entries
# <date> <int>
#1 2020-07-01 4
#2 2020-08-01 6
#3 2021-03-01 7
#4 2021-06-01 10
Upvotes: 1