Reputation: 167
I have an irregular time series, and i'm trying to account with R a rolling sum on 3 month window for each operation associated to an ID.
data are structured as follow
ID Operation date value
A 1 01/01/2017 0
A 2 01/02/2017 1
A 3 01/06/2017 1
A 4 01/09/2017 0
B 1 01/03/2017 0
B 2 01/05/2017 1
B 3 01/09/2017 0
B 4 01/10/2017 1
i'm looking for this output
ID Operation date value cumsum
A 1 01/01/2017 0 0
A 2 01/02/2017 1 1
A 3 01/06/2017 1 1
A 4 01/09/2017 0 1
B 1 01/03/2017 0 0
B 2 01/05/2017 1 1
B 3 01/09/2017 1 1
B 4 01/10/2017 1 2
now i'm using this script
DB<-DB[with(DB,order(ID,date)),]
DB<-DB %>% group_by(ID) %>% mutate(cumsum = cumsum(value))
but it sum value for all past operation. how can i introduce the 3 month rolling sum?
Upvotes: 0
Views: 624
Reputation: 16121
It's not possible to flag in advance your 3 month windows, because you want to go back 3 months from every date in your dataset and that means that your reference point (date) changes every time. Therefore you need a function that takes that into account and apply it on every row.
library(lubridate)
library(dplyr)
# sample dataset
dt = read.table(text="ID Operation date value
A 1 01/01/2017 0
A 2 01/02/2017 1
A 3 01/06/2017 1
A 4 01/09/2017 0
B 1 01/03/2017 0
B 2 01/05/2017 1
B 3 01/09/2017 1
B 4 01/10/2017 1", header=T, stringsAsFactors=F)
# function that goes 3 months back from a given date and a given ID
f = function(ID_input, date_input) {
enddate = date_input
startdate = date_input - months(3)
sum((dt %>% filter(ID == ID_input & date >= startdate & date <= enddate))$value) }
f = Vectorize(f)
# update date column
dt$date = dmy(dt$date)
# run function for every row
dt %>% mutate(sumvalue = f(ID, date))
# ID Operation date value sumvalue
# 1 A 1 2017-01-01 0 0
# 2 A 2 2017-02-01 1 1
# 3 A 3 2017-06-01 1 1
# 4 A 4 2017-09-01 0 1
# 5 B 1 2017-03-01 0 0
# 6 B 2 2017-05-01 1 1
# 7 B 3 2017-09-01 1 1
# 8 B 4 2017-10-01 1 2
Upvotes: 1