Math Lover
Math Lover

Reputation: 177

Sum of a variable based on two grouped variables - for the previous year

I have a list of customers, revenue from them through two channels (online & offline) over the last few financial years. I would like to have a variable showing each customer's total revenue (online+offline) for the previous year.

Sample data is shown below with desired variablehighlighted in yellow. Calculation is shown in adjacent column.

Sample Data. Desired column highlighted in yellow. Calculation shown in adjacent column.

I tried grouping by CustomerID & Fin Year, calculate sum of Revenue and use lag() function to get previous year's total revenue but it didn't work.

df %>% group_by(CustomerID, FinYear) %>% mutate(yearly_totalRevenue = sum(Revenue)) %>% mutate(lastyear_totalRevenue = lag(yearly_totalRevenue )) %>%  ungroup() 

Note: As the data volume is in the range of 10M, a memory efficient code (preferably using data.table functionalities) will be highly appreciated.

Thanks in advance.

Edit1: dput() of the sample data is added.

structure(list(CustomerID = c("Cust1", "Cust2", "Cust3", "Cust4", 
"Cust5", "Cust1", "Cust2", "Cust3", "Cust4", "Cust5"), `Fin Year` = 
c("2010/11", 
"2011/12", "2012/13", "2013/14", "2014/15", "2010/11", "2011/12", 
"2012/13", "2013/14", "2014/15"), Channel = c("Online", "Online", 
"Online", "Online", "Online", "Offline", "Offline", "Offline", 
"Offline", "Offline"), Revenue = c(858, 733, 248, 541, 222, 316, 
412, 167, 385, 654)), row.names = c(NA, -10L), class = c("tbl_df", 
"tbl", "data.frame"))

Upvotes: 0

Views: 47

Answers (1)

chinsoon12
chinsoon12

Reputation: 25225

You can try:

setDT(df)[, yearly_totalRevenue := sum(Revenue), .(CustomerID, FinYear)][, 
    lastyear_totalRevenue := shift(yearly_totalRevenue), .(rowid(CustomerID))]

output:

    CustomerID FinYear Channel Revenue yearly_totalRevenue lastyear_totalRevenue
 1:      Cust1 2010/11  Online     858                1174                    NA
 2:      Cust2 2011/12  Online     733                1145                  1174
 3:      Cust3 2012/13  Online     248                 415                  1145
 4:      Cust4 2013/14  Online     541                 926                   415
 5:      Cust5 2014/15  Online     222                 876                   926
 6:      Cust1 2010/11 Offline     316                1174                    NA
 7:      Cust2 2011/12 Offline     412                1145                  1174
 8:      Cust3 2012/13 Offline     167                 415                  1145
 9:      Cust4 2013/14 Offline     385                 926                   415
10:      Cust5 2014/15 Offline     654                 876                   926

Upvotes: 1

Related Questions