caranbot
caranbot

Reputation: 113

Find the maximum of a variable by overlapping time intervals

I have data of customers with different purchases, where every purchase can have different dunning level for a certain time interval, where time intervals can be overlapping.

My goal is to have the max dunning level of one customer for a certain period of time. For simplicity the time intervals are integers. The data type for from and to is POSIXct, format: "2018-04-10 15:46:38".

library(data.table)

input <- data.table(purchase = c(1,1,1,1
                   ,2,2,2,2
                   ,3,3,3,3
                   ,4,4,4,4)
           , state = c(-1,0,1,-1
                       ,-1,0,1,-1
                       ,-1,1,2,-1
                       ,-1,0,1,2)
           , from = c(0,1,4,6
                      ,2,4,5,8
                      ,3,7,9,10
                      ,0,1,3,9)
           , to = c(1,4,6,99
                    ,4,5,8,99
                    ,7,9,10,99
                    ,1,3,9,99))

From the data above I would like to create the maximum dunning level within new time intervals that are not overlapping.

output <- data.table(state = c(-1,0,1,2)
           , from = c(0,1,4,9)
           , to = c(1,4,9,99))

   state from to
1:    -1    0  1
2:     0    1  4
3:     1    4  9
4:     2    9 99

Upvotes: 2

Views: 86

Answers (1)

Jaap
Jaap

Reputation: 83215

A possible solution:

output <- input[, k := cumsum(c(TRUE, tail(from, -1) >= head(to, -1))), by = state
                ][k == 1, .(from = from[1], to = to[.N]), by = state]

which gives:

> output
   state from to
1:    -1    0  1
2:     0    1  4
3:     1    4  9
4:     2    9 99

Upvotes: 2

Related Questions