Check for overlapping intervals using start and end times

Question

I have this data frame, sorted by EndTime:

 df = data.frame(ID = c(1,1,1,1,1,1,1), NumberInSequence = c(1,2,3,4,5,6,7),
                 StartTime = as.POSIXct(c("2016-01-15 18:02:11 GMT","2016-01-15 18:10:33 GMT","2016-01-15 18:25:08 GMT",
                                          "2016-01-15 18:33:56 GMT","2016-01-15 18:21:03 GMT","2016-01-15 19:55:09 GMT","2016-01-15 19:57:03 GMT")),
                 EndTime = as.POSIXct(c("2016-01-15 18:02:17 GMT","2016-01-15 18:10:39 GMT","2016-01-15 18:25:14 GMT",
                                        "2016-01-15 18:34:02 GMT","2016-01-15 19:53:17 GMT","2016-01-15 19:56:15 GMT","2016-01-15 19:58:17 GMT"))
                 )

Each row is a time interval with a start time and an end time:

df

  ID NumberInSequence           StartTime             EndTime
1  1                1 2016-01-15 18:02:11 2016-01-15 18:02:17
2  1                2 2016-01-15 18:10:33 2016-01-15 18:10:39
3  1                3 2016-01-15 18:25:08 2016-01-15 18:25:14
4  1                4 2016-01-15 18:33:56 2016-01-15 18:34:02
5  1                5 2016-01-15 18:21:03 2016-01-15 19:53:17
6  1                6 2016-01-15 19:55:09 2016-01-15 19:56:15
7  1                7 2016-01-15 19:57:03 2016-01-15 19:58:17

Then I use dplyr to add two fields: NextStartTime, the start time of the next interval in the sequence, and WaitTime, the difference between NextStartTime and EndTime. The "WaitTime" column is correct in most cases, but not when there are overlapping intervals (see the negative value in row 4 below).

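   library(dplyr)

   df %>% group_by(ID) %>%
     mutate(
       # next row's StartTime, kept only when that row is the next number in the sequence
       NextStartTime = lead(StartTime)[ifelse(lead(NumberInSequence) == (NumberInSequence + 1), TRUE, NA)],
       # gap in seconds between this interval's end and the next interval's start
       WaitTime = difftime(NextStartTime, EndTime, units = 's')
       #max_s = max(StartTime) #,
       # cum_max_s = as.POSIXct(cummin(as.numeric(StartTime)),origin="1970-01-01")
     )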


  ID NumberInSequence           StartTime             EndTime       NextStartTime  WaitTime
1  1                1 2016-01-15 18:02:11 2016-01-15 18:02:17 2016-01-15 18:10:33  496 secs
2  1                2 2016-01-15 18:10:33 2016-01-15 18:10:39 2016-01-15 18:25:08  869 secs
3  1                3 2016-01-15 18:25:08 2016-01-15 18:25:14 2016-01-15 18:33:56  522 secs
4  1                4 2016-01-15 18:33:56 2016-01-15 18:34:02 2016-01-15 18:21:03 -779 secs
5  1                5 2016-01-15 18:21:03 2016-01-15 19:53:17 2016-01-15 19:55:09  112 secs
6  1                6 2016-01-15 19:55:09 2016-01-15 19:56:15 2016-01-15 19:57:03   48 secs
7  1                7 2016-01-15 19:57:03 2016-01-15 19:58:17                   NA secs

Now I need to add a column called "FLAG" whose value is either OK or NOT OK, where:

"OK" means the interval is neither entirely nor partially within another interval, so it does not overlap any other interval.

"NOT OK" means the interval is partially or entirely within another interval, or contains one, so it overlaps at least one other interval.

Below are the intervals, the expected value of the FLAG column for each, and a short explanation:

 StartTime           EndTime                 FLAG
2016-01-15 18:02:11 2016-01-15 18:02:17     OK - this interval does not overlap with other intervals
2016-01-15 18:10:33 2016-01-15 18:10:39     OK - this interval does not overlap with other intervals
2016-01-15 18:25:08 2016-01-15 18:25:14     NOT OK - this interval is within the interval that starts at 18:21:03
2016-01-15 18:33:56 2016-01-15 18:34:02     NOT OK - this interval is within the interval that starts at 18:21:03
2016-01-15 18:21:03 2016-01-15 19:53:17     NOT OK - this interval contains other intervals
2016-01-15 19:55:09 2016-01-15 19:56:15     OK - this interval does not overlap with other intervals
2016-01-15 19:57:03 2016-01-15 19:58:17     OK - this interval does not overlap with other intervals
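To make the rule above concrete, here is a minimal base R sketch of one way the FLAG values could be computed by comparing every interval against every other one. The helper name `flag_overlaps` is hypothetical (it is not part of dplyr or base R), the time zone is assumed to be GMT, and intervals that merely touch at an endpoint are assumed not to overlap.

```r
# Hypothetical helper: "NOT OK" if an interval overlaps any other interval.
flag_overlaps <- function(start, end) {
  s <- as.numeric(start)
  e <- as.numeric(end)
  # overlap[i, j] is TRUE when interval i starts before j ends
  # and ends after j starts, i.e. the two share some time
  overlap <- outer(s, e, "<") & outer(e, s, ">")
  diag(overlap) <- FALSE  # an interval always "overlaps" itself; ignore that
  ifelse(rowSums(overlap) > 0, "NOT OK", "OK")
}

# Same times as in the question; tz = "GMT" is an assumption
start <- as.POSIXct(c("2016-01-15 18:02:11", "2016-01-15 18:10:33",
                      "2016-01-15 18:25:08", "2016-01-15 18:33:56",
                      "2016-01-15 18:21:03", "2016-01-15 19:55:09",
                      "2016-01-15 19:57:03"), tz = "GMT")
end   <- as.POSIXct(c("2016-01-15 18:02:17", "2016-01-15 18:10:39",
                      "2016-01-15 18:25:14", "2016-01-15 18:34:02",
                      "2016-01-15 19:53:17", "2016-01-15 19:56:15",
                      "2016-01-15 19:58:17"), tz = "GMT")

flags <- flag_overlaps(start, end)
print(data.frame(StartTime = start, EndTime = end, FLAG = flags))
```

Because the helper takes two plain vectors, it should also fit inside the existing pipeline as `group_by(ID) %>% mutate(FLAG = flag_overlaps(StartTime, EndTime))`. The comparison is pairwise, so the work grows with the square of the number of rows per ID.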

I was thinking of using cummin or cummax in dplyr, maybe something like:

cum_max_s = as.POSIXct(cummin(as.numeric(StartTime)),origin="1970-01-01")
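A rough sketch of how the running-maximum idea might be made to work, under some assumptions: the rows are ordered by start time (not end time), the running maximum is taken over the end times, and touching endpoints do not count as overlap. The helper name `flag_overlaps_sorted` is hypothetical and the time zone is assumed to be GMT. An interval is flagged if it starts before the latest end seen so far, or if it ends after the next interval starts.

```r
# Hypothetical helper: overlap check based on cummax() over end times.
flag_overlaps_sorted <- function(start, end) {
  s <- as.numeric(start)
  e <- as.numeric(end)
  ord <- order(s)            # process intervals in start-time order
  s <- s[ord]
  e <- e[ord]
  n <- length(s)
  # latest end time among all intervals that start earlier
  prev_max_end <- c(-Inf, cummax(e)[-n])
  # earliest start time among all intervals that start later
  next_start <- c(s[-1], Inf)
  overlapped <- s < prev_max_end | e > next_start
  flag <- character(n)
  flag[ord] <- ifelse(overlapped, "NOT OK", "OK")  # restore input order
  flag
}

# Same times as in the question; tz = "GMT" is an assumption
start <- as.POSIXct(c("2016-01-15 18:02:11", "2016-01-15 18:10:33",
                      "2016-01-15 18:25:08", "2016-01-15 18:33:56",
                      "2016-01-15 18:21:03", "2016-01-15 19:55:09",
                      "2016-01-15 19:57:03"), tz = "GMT")
end   <- as.POSIXct(c("2016-01-15 18:02:17", "2016-01-15 18:10:39",
                      "2016-01-15 18:25:14", "2016-01-15 18:34:02",
                      "2016-01-15 19:53:17", "2016-01-15 19:56:15",
                      "2016-01-15 19:58:17"), tz = "GMT")

sorted_flags <- flag_overlaps_sorted(start, end)
print(sorted_flags)
```

This version only needs one sort and one pass, which may matter for large groups. It has not been checked against empty input or missing times.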
