Inkyu
Inkyu

Reputation: 17

How to compute overlapping time intervals between groups

I have a dataframe with group id, start time, and end time. I would like to compute overlapped time interval between groups. Here is an example of the dataset;

id <- c("a","a","b","c","c")
start_time <-as.POSIXct(c("2016-05-27 09:30:00","2016-05-27 15:30:00",
                          "2016-05-27 14:30:00","2016-05-27 09:40:00","2016-05-27 15:00:00"),tz= "UTC")
end_time <-as.POSIXct(c("2016-05-27 10:30:00","2016-05-27 17:30:00",
                        "2016-05-27 16:30:00","2016-05-27 09:50:00","2016-05-27 16:00:00"),tz= "UTC")

df <- data.frame(id,start_time,end_time)

and the example dataframe looks like:

            ID             start_time           end_time
1           a        2016-05-27 09:30:00    2016-05-27 10:30:00
2           a        2016-05-27 15:30:00    2016-05-27 17:30:00
3           b        2016-05-27 14:30:00    2016-05-27 16:30:00
4           c        2016-05-27 09:40:00    2016-05-27 09:50:00
5           c        2016-05-27 15:00:00    2016-05-27 16:00:00

The desired result from the suggested dataframe is

            ID_1             ID_2        overlap
1           a                 b         0 + 60 mins
2           a                 c        10 + 0 + 0 + 30 mins
3           b                 c         0 + 60 mins

The last column does not have to show all the cases. It is just to help your understand. Would there be anyway to compute total overlapped time between groups by comparing all the time intervals?

Upvotes: 1

Views: 71

Answers (1)

Sirius
Sirius

Reputation: 5429

Here goes:



library(magrittr)
library(lubridate)
library(tidyr)

df %<>% mutate( interval = interval( start_time, end_time ) )

df %>% full_join( df, by=character(), suffix=c("_1","_2") ) %>%
    mutate( overlap = lubridate::intersect( interval_1, interval_2 ) ) %>%
    filter( id_1 < id_2 ) %>%
    replace_na( list(overlap=0) ) %>%
    group_by( id_1, id_2 ) %>%
    summarise( overlap = paste(paste( as.numeric( overlap ) / 60, collapse=" + " ),"mins"))

various lubridate functions are key to the solution, the rest is just infrastructure

Output:


  id_1  id_2  overlap              
  <chr> <chr> <chr>                
1 a     b     0 + 60 mins          
2 a     c     10 + 0 + 0 + 30 mins
3 b     c     0 + 60 mins  

Upvotes: 1

Related Questions