Reputation: 176
I have a table with client name, payment month and spend amount, as below:
c_name p_month spend
ABC 201401 100
ABC 201402 150
ABC 201403 230
DEF 201401 110
DEF 201402 190
DEF 201403 300
I want to calculate month on month change (mom_change) and month on month percentage change (mom_per_change) in spend for each client. the desired output would be -
c_name p_month spend mom_change mom_per_change
ABC 201401 100 Blank Blank
ABC 201402 150 50 0.5
ABC 201403 230 80 0.533
DEF 201401 110 Blank Blank
DEF 201402 190 80 0.727
DEF 201403 300 110 0.578
I tried using diff
to calculate change for each client separately. The issue is there are around 10000 clients and it takes a lot of time to compute this using a loop. Any help is much appreciated. Thanks.
Upvotes: 1
Views: 2510
Reputation: 51592
The dplyr
approach,
library(dplyr)
df %>%
group_by(c_name) %>%
mutate(mom_change = c(NA, diff(spend)), mom_per_change = (spend - lag(spend))/lag(spend))
#Source: local data frame [6 x 5]
#Groups: c_name [2]
# c_name p_month spend mom_change mom_per_change
# (fctr) (int) (int) (dbl) (dbl)
#1 ABC 201401 100 NA NA
#2 ABC 201402 150 50 0.5000000
#3 ABC 201403 230 80 0.5333333
#4 DEF 201401 110 NA NA
#5 DEF 201402 190 80 0.7272727
#6 DEF 201403 300 110 0.5789474
Upvotes: 1
Reputation: 214957
Here is a solution using data.table
where blank
is replaced by NA
s:
library(data.table)
setDT(df)[, `:=` (mom_change = c(NA, diff(spend)),
mom_per_change = round(c(NA, diff(spend))/shift(spend), 3)), .(c_name)]
df
c_name p_month spend mom_change mom_per_change
1: ABC 201401 100 NA NA
2: ABC 201402 150 50 0.500
3: ABC 201403 230 80 0.533
4: DEF 201401 110 NA NA
5: DEF 201402 190 80 0.727
6: DEF 201403 300 110 0.579
Upvotes: 1
Reputation: 1954
This works using data.table
and shift()
dt<-data.table(c_name=c("ABC","ABC","ABC","DEF","DEF","DEF"),
pmonth=c(201401,201402,201403,201401,201402,201403),
spend=c(100,150,230,110,190,300))
dt[, mom_change := (spend-shift(spend)), by=c_name]
dt[, mom_per_change := (spend-shift(spend))/shift(spend), by=c_name]
dt
c_name pmonth spend mom_change mom_per_change
1: ABC 201401 100 NA NA
2: ABC 201402 150 50 0.5000000
3: ABC 201403 230 80 0.5333333
4: DEF 201401 110 NA NA
5: DEF 201402 190 80 0.7272727
6: DEF 201403 300 110 0.5789474
Upvotes: 1