Calculate time difference by condition

Question

My data contains start and finish times for workers on their shifts. I want to know the duration of each shift for each worker.

The dataset is quite large (many workers and many shifts), so here is a small example:

            TimeStart          TimeFinish ShiftNo Worker
1 2017-04-10 00:06:18 2017-04-10 00:06:19      S1  Caleb
2 2017-04-10 00:19:56 2017-04-10 00:20:16      S1  Caleb
3 2017-04-10 00:00:00 2017-04-10 00:00:20      S2  Caleb
4 2017-04-10 00:08:32 2017-04-10 00:08:52      S2  Caleb
5 2017-04-10 00:25:35 2017-04-10 00:25:55      S2  Caleb
6 2017-04-10 00:00:00 2017-04-10 00:00:19      S3  Caleb

I wish to calculate the length of each shift, by subtracting the first entry of TimeStart from the last entry of TimeFinish.

Ideally, I would like to do this in dplyr, but I don't think this is the correct code:

ShiftDuration <- df %>%
  group_by(Worker, Shift) %>% 
  summarise(Duration = TimeFinish-TimeStart)

Any help would be greatly appreciated.

neilfws · Accepted Answer

You're almost there. Your group_by should be (Worker, ShiftNo) (not Shift, assuming your example data is correct). Presumably you want the minimum start time and maximum finish time, per worker, per shift:

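df %>%
  group_by(Worker, ShiftNo) %>%
  # Use difftime() with explicit units: plain subtraction picks the unit
  # per result, and S3 lasts only 19 seconds, not 19 minutes
  summarise(Duration = difftime(max(TimeFinish), min(TimeStart), units = "mins"))

  Worker ShiftNo        Duration
1  Caleb      S1 13.9666667 mins
2  Caleb      S2 25.9166667 mins
3  Caleb      S3  0.3166667 mins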
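
One detail worth checking in a result like this is the time unit. Subtracting two date-times in R returns a difftime whose unit is chosen automatically from the size of the difference, so a shift under a minute is measured in seconds while longer ones are measured in minutes. Here is a minimal base R sketch of that behaviour, using two shifts rebuilt from the example data. The variable names and the UTC time zone are my assumptions, not part of the question.

```r
# Two shifts rebuilt from the example data (tz = "UTC" is an assumption)
s1_start  <- as.POSIXct("2017-04-10 00:06:18", tz = "UTC")
s1_finish <- as.POSIXct("2017-04-10 00:20:16", tz = "UTC")
s3_start  <- as.POSIXct("2017-04-10 00:00:00", tz = "UTC")
s3_finish <- as.POSIXct("2017-04-10 00:00:19", tz = "UTC")

# Plain subtraction: the unit depends on the size of the difference
units(s1_finish - s1_start)  # "mins"
units(s3_finish - s3_start)  # "secs"

# difftime() with explicit units keeps every shift on the same scale
s1 <- difftime(s1_finish, s1_start, units = "mins")
s3 <- difftime(s3_finish, s3_start, units = "mins")

# as.numeric() drops the difftime class when a plain number is needed
s1_mins <- as.numeric(s1)  # 838 seconds expressed in minutes
s3_mins <- as.numeric(s3)  # 19 seconds expressed in minutes
```

If you need plain numbers for further calculations, wrapping the difftime() call in as.numeric() inside summarise() should give a numeric Duration column in minutes.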
