Juan Lozano
Juan Lozano

Reputation: 667

Row-wise cumulative sum over some columns, followed by row-wise division

I have the following data.frame:

New_Sign_ups   Spend   2021-05  2021-06      2021-07  MRR_sum  time
100             100       0       10         100       110     1.5
50              200       10      10          40        60     1.8

First, I need to do a row-wise cumulative sum for columns 3 ~ 5:

New_Sign_ups   Spend   2021-05  2021-06      2021-07  MRR_sum  time
100             100       0       10         110       110     1.5
50              200       10      20          60        60     1.8

Then I need to divide the cumulative sum by the Spend value of that row:

New_Sign_ups   Spend   2021-05  2021-06      2021-07  MRR_sum  time
100             100       0       0.1         1.1       110     1.5
50              200       0.05    0.1         0.3        60     1.8

How can I do this?

Upvotes: 2

Views: 260

Answers (3)

Mohamed Desouky
Mohamed Desouky

Reputation: 4425

  • We can use rowwise paired with c_across from dplyr library , then update the values of the data.frame
library(dplyr)

df <- cohort_table3 |> rowwise() |> 
      mutate(ans = list(cumsum(c_across(3:5))/Spend))

cohort_table3[3:5] <- do.call(rbind , df$ans)
  • output
 New_Sign_ups Spend X2021.05 X2021.06 X2021.07 MRR_sum time
1          100   100     0.00      0.1      1.1     110  1.5
2           50   200     0.05      0.1      0.3      60  1.8

Upvotes: 1

Anoushiravan R
Anoushiravan R

Reputation: 21938

Here is also a tidyverse solution with mostly purrr package functions. Thank you Zheyuan Li for the data:

library(dplyr)
library(purrr)

dat %>%
  mutate(pmap_df(dat, ~ {
    accumulate(c(...)[str_detect(names(dat), '\\d+')], `+`) / c(...)['Spend' == names(dat)]}))

  New_Sign_ups Spend 2021-05 2021-06 2021-07 MRR_sum time
1          100   100    0.00     0.1     1.1     110  1.5
2           50   200    0.05     0.1     0.3      60  1.8

Upvotes: 2

Zheyuan Li
Zheyuan Li

Reputation: 73415

dat <- structure(list(New_Sign_ups = c(100L, 50L), Spend = c(100L, 200L
), `2021-05` = c(0L, 10L), `2021-06` = c(10L, 10L), `2021-07` = c(100L, 
40L), MRR_sum = c(110L, 60L), time = c(1.5, 1.8)), class = "data.frame", 
row.names = c(NA, -2L))

Without using any packages

dat[3:5] <- do.call("cbind", Reduce(`+`, dat[3:5], accumulate = TRUE)) / dat$Spend
#  New_Sign_ups Spend 2021-05 2021-06 2021-07 MRR_sum time
#1          100   100    0.00     0.1     1.1     110  1.5
#2           50   200    0.05     0.1     0.3      60  1.8

Using package matrixStats

library(matrixStats)
dat[3:5] <- rowCumsums(as.matrix(dat[3:5])) / dat$Spend
#  New_Sign_ups Spend 2021-05 2021-06 2021-07 MRR_sum time
#1          100   100    0.00     0.1     1.1     110  1.5
#2           50   200    0.05     0.1     0.3      60  1.8

Upvotes: 3

Related Questions