mr4ase
mr4ase

Reputation: 23

How to count hours in the given intervals

I have a dataset with start-time and end-time of using the service. About 1000 rows totally. I need to count the number of hours in any given interval.

Dataset:

   court_number       start_time         end_time  service
1:      court 2 2020-03-01 11:00:00 2020-03-01 12:30:00  booking
2:      court 3 2020-03-01 12:30:00 2020-03-01 13:30:00 coaching
3:      court 1 2020-03-01 11:00:00 2020-03-01 13:00:00  booking
4:      court 5 2020-03-01 12:00:00 2020-03-01 16:00:00  booking
5:      court 5 2020-03-01 16:30:00 2020-03-01 18:30:00 coaching
library(data.table)
dt <- data.table(court_number = c('court 2','court 3','court 1','court 5','court 5'),
            start_time = c('2020-03-01 11:00:00', '2020-03-01 12:30:00', '2020-03-01 11:00:00', '2020-03-01 12:00:00', '2020-03-01 16:30:00'),
            end_time = c('2020-03-01 12:30:00', '2020-03-01 13:30:00', '2020-03-01 13:00:00', '2020-03-01 16:00:00', '2020-03-01 18:30:00'),
            service = c('booking','coaching','booking','booking','coaching'))

I want to count, for example, hours in the interval from "12:00" to "17:00". So, I need to create a column with hours from 12 to 17:

   court_number       start_time         end_time  service interval_12_17
1:      court 2 2020-03-01 11:00:00 2020-03-01 12:30:00  booking            0,5
2:      court 3 2020-03-01 12:30:00 2020-03-01 13:30:00 coaching              1
3:      court 1 2020-03-01 11:00:00 2020-03-01 13:00:00  booking              1
4:      court 5 2020-03-01 12:00:00 2020-03-01 16:00:00  booking              4
5:      court 5 2020-03-01 16:30:00 2020-03-01 18:30:00 coaching            0,5

I've read a lot of similar resolved issues here at stackoverflow, but due to the fact that I am newbie to R, it didn't help me =)

Upvotes: 2

Views: 45

Answers (1)

akrun
akrun

Reputation: 887831

We convert the 'time' columns to Datetime class, get the difftime of pmin of the 'Time' for 'end_time' and "17:00:00" and the pmax of the 'start_time' and "12:00:00"

library(dplyr)
library(lubridate)
library(data.table)
dt %>%
    mutate_at(vars(ends_with('time')), ymd_hms) %>% 
    mutate(interval_12_17 = difftime(pmin(as.ITime(end_time), 
           as.ITime("17:00:00")),
         pmax(as.ITime(start_time),as.ITime("12:00:00")), unit = 'hour'))
#   court_number          start_time            end_time  service interval_12_17
#1:      court 2 2020-03-01 11:00:00 2020-03-01 12:30:00  booking      0.5 hours
#2:      court 3 2020-03-01 12:30:00 2020-03-01 13:30:00 coaching      1.0 hours
#3:      court 1 2020-03-01 11:00:00 2020-03-01 13:00:00  booking      1.0 hours
#4:      court 5 2020-03-01 12:00:00 2020-03-01 16:00:00  booking      4.0 hours
#5:      court 5 2020-03-01 16:30:00 2020-03-01 18:30:00 coaching      0.5 hours

Or using data.table

dt[, interval_12_17 := difftime(pmin(as.ITime(end_time), as.ITime("17:00:00")),
        pmax(as.ITime(start_time),as.ITime("12:00:00")), unit = 'hour')][]
#    court_number          start_time            end_time  service interval_12_17
#1:      court 2 2020-03-01 11:00:00 2020-03-01 12:30:00  booking      0.5 hours
#2:      court 3 2020-03-01 12:30:00 2020-03-01 13:30:00 coaching      1.0 hours
#3:      court 1 2020-03-01 11:00:00 2020-03-01 13:00:00  booking      1.0 hours
#4:      court 5 2020-03-01 12:00:00 2020-03-01 16:00:00  booking      4.0 hours
#5:      court 5 2020-03-01 16:30:00 2020-03-01 18:30:00 coaching      0.5 hours

Upvotes: 2

Related Questions