Tyu1990
Tyu1990

Reputation: 167

R: Rolling sum on a non standard window

I have an irregular time series, and i'm trying to account with R a rolling sum on 3 month window for each operation associated to an ID.

data are structured as follow

ID    Operation    date         value
A         1       01/01/2017      0
A         2       01/02/2017      1
A         3       01/06/2017      1
A         4       01/09/2017      0
B         1       01/03/2017      0
B         2       01/05/2017      1
B         3       01/09/2017      0
B         4       01/10/2017      1

i'm looking for this output

ID    Operation    date         value   cumsum
A         1       01/01/2017      0        0
A         2       01/02/2017      1        1
A         3       01/06/2017      1        1
A         4       01/09/2017      0        1
B         1       01/03/2017      0        0
B         2       01/05/2017      1        1
B         3       01/09/2017      1        1
B         4       01/10/2017      1        2

now i'm using this script

DB<-DB[with(DB,order(ID,date)),]
DB<-DB %>% group_by(ID) %>% mutate(cumsum = cumsum(value)) 

but it sum value for all past operation. how can i introduce the 3 month rolling sum?

Upvotes: 0

Views: 624

Answers (1)

AntoniosK
AntoniosK

Reputation: 16121

It's not possible to flag in advance your 3 month windows, because you want to go back 3 months from every date in your dataset and that means that your reference point (date) changes every time. Therefore you need a function that takes that into account and apply it on every row.

library(lubridate)
library(dplyr)

# sample dataset
dt = read.table(text="ID    Operation    date         value
                A         1       01/01/2017      0
                A         2       01/02/2017      1
                A         3       01/06/2017      1
                A         4       01/09/2017      0
                B         1       01/03/2017      0
                B         2       01/05/2017      1
                B         3       01/09/2017      1
                B         4       01/10/2017      1", header=T, stringsAsFactors=F)

# function that goes 3 months back from a given date and a given ID
f = function(ID_input, date_input) { 
  enddate = date_input
  startdate = date_input - months(3)
  sum((dt %>% filter(ID == ID_input & date >= startdate & date <= enddate))$value) }

f = Vectorize(f)

# update date column
dt$date = dmy(dt$date)

# run function for every row
dt %>% mutate(sumvalue = f(ID, date))


#   ID Operation       date value sumvalue
# 1  A         1 2017-01-01     0        0
# 2  A         2 2017-02-01     1        1
# 3  A         3 2017-06-01     1        1
# 4  A         4 2017-09-01     0        1
# 5  B         1 2017-03-01     0        0
# 6  B         2 2017-05-01     1        1
# 7  B         3 2017-09-01     1        1
# 8  B         4 2017-10-01     1        2

Upvotes: 1

Related Questions