Oshan
Oshan

Reputation: 176

Calculate month on month change in R

I have a table with client name, payment month and spend amount, as below:

c_name  p_month  spend
  ABC    201401   100
  ABC    201402   150
  ABC    201403   230
  DEF    201401   110
  DEF    201402   190
  DEF    201403   300

I want to calculate month on month change (mom_change) and month on month percentage change (mom_per_change) in spend for each client. the desired output would be -

c_name  p_month  spend  mom_change  mom_per_change
  ABC    201401   100     Blank         Blank
  ABC    201402   150      50            0.5
  ABC    201403   230      80           0.533
  DEF    201401   110     Blank         Blank
  DEF    201402   190      80           0.727
  DEF    201403   300     110           0.578

I tried using diff to calculate change for each client separately. The issue is there are around 10000 clients and it takes a lot of time to compute this using a loop. Any help is much appreciated. Thanks.

Upvotes: 1

Views: 2510

Answers (3)

Sotos
Sotos

Reputation: 51592

The dplyr approach,

library(dplyr)
df %>% 
 group_by(c_name) %>% 
 mutate(mom_change = c(NA, diff(spend)), mom_per_change = (spend - lag(spend))/lag(spend))

#Source: local data frame [6 x 5]
#Groups: c_name [2]

#  c_name p_month spend mom_change  mom_per_change  
#  (fctr)   (int) (int)      (dbl)     (dbl)
#1    ABC  201401   100         NA        NA
#2    ABC  201402   150         50 0.5000000
#3    ABC  201403   230         80 0.5333333
#4    DEF  201401   110         NA        NA
#5    DEF  201402   190         80 0.7272727
#6    DEF  201403   300        110 0.5789474

Upvotes: 1

akuiper
akuiper

Reputation: 214957

Here is a solution using data.table where blank is replaced by NAs:

library(data.table)
setDT(df)[, `:=` (mom_change = c(NA, diff(spend)), 
                  mom_per_change = round(c(NA, diff(spend))/shift(spend), 3)), .(c_name)]
df
   c_name p_month spend mom_change mom_per_change
1:    ABC  201401   100         NA             NA
2:    ABC  201402   150         50          0.500
3:    ABC  201403   230         80          0.533
4:    DEF  201401   110         NA             NA
5:    DEF  201402   190         80          0.727
6:    DEF  201403   300        110          0.579

Upvotes: 1

moman822
moman822

Reputation: 1954

This works using data.table and shift()

dt<-data.table(c_name=c("ABC","ABC","ABC","DEF","DEF","DEF"), 
               pmonth=c(201401,201402,201403,201401,201402,201403), 
               spend=c(100,150,230,110,190,300))

dt[, mom_change := (spend-shift(spend)), by=c_name]

dt[, mom_per_change := (spend-shift(spend))/shift(spend), by=c_name]

dt
   c_name pmonth spend mom_change mom_per_change
1:    ABC 201401   100         NA             NA
2:    ABC 201402   150         50      0.5000000
3:    ABC 201403   230         80      0.5333333
4:    DEF 201401   110         NA             NA
5:    DEF 201402   190         80      0.7272727
6:    DEF 201403   300        110      0.5789474

Upvotes: 1

Related Questions