GuiW
GuiW

Reputation: 19

Calculate time difference in this situation

Here is the data frame in r,

ride      Start time            End time                
 1       2019-09-01 06:02:09    2019-09-01 06:03:09  
 2       2019-09-01 10:25:21    2019-09-01 10:26:22
 3       2019-09-01 11:10:34    2019-09-01 11:11:36
 4       2019-09-01 18:10:45    2019-09-01 18:11:47

And I want to create a column in r to express the difference in time between each ride, for example, the difference in time(hours) between second ride and first ride is the start time of second ride minus the end time of first ride. I want the result like this:

ride      Start time             End time               difftime
 1       2019-09-01 06:02:09    2019-09-01 06:03:09       NA
 2       2019-09-01 10:25:21    2019-09-01 10:26:22 4hours(10:25:21-06:03:09=4hours)
 3       2019-09-01 11:10:34    2019-09-01 11:11:36    0.73 hour(11:10:34-10:26:22)
 4       2019-09-02 00:10:45    2019-09-01 01:11:47    2019-09-02-00:10:45minus2019-09-01 
                                                                                 11:11:36

Thanks in advance!

Upvotes: 0

Views: 140

Answers (2)

psolomon
psolomon

Reputation: 173

Offset the end_time vector by one row. Delete the last element in the end_time vector and add a null element to its beginning.

ride      Start time            End time                
 1       2019-09-01 06:02:09    0  
 2       2019-09-01 10:25:21    2019-09-01 06:03:09
 3       2019-09-01 11:10:34    2019-09-01 10:26:22
 4       2019-09-01 18:10:45    2019-09-01 11:11:36

As such, the end time of ride 1 is matched with the start time of ride 2, and the end time of ride 2 is matched with the start time of ride 3, etc.

You can then add a new column to the data frame using the difftime operator. Here I am using NA for the first value of the end_time vector.

ride <- c(1, 2, 3, 4)
start_time <- as.POSIXct(c('2019-09-01 06:02:09', '2019-09-01 10:25:21', '2019-09-01 11:10:34', '2019-09-01 18:10:45'))
end_time <- as.POSIXct(c(NA, '2019-09-01 06:03:09', '2019-09-01 10:26:22', '2019-09-01 11:11:36'))

ride_data <- data.frame(ride, start_time, end_time)
ride_data$diff_val <- with(ride_data, difftime(start_time, end_time, units = "hours"))

Printing the above data frame:

  ride          start_time            end_time        diff_val
1    1 2019-09-01 06:02:09                <NA>        NA hours
2    2 2019-09-01 10:25:21 2019-09-01 06:03:09 4.3700000 hours
3    3 2019-09-01 11:10:34 2019-09-01 10:26:22 0.7366667 hours
4    4 2019-09-01 18:10:45 2019-09-01 11:11:36 6.9858333 hours

Upvotes: 1

tjebo
tjebo

Reputation: 23727

The biggest challenge was to read your data ;) The difftime problem is fairly straight forward with the tidyverse, in particular {dplyr}.

library(lubridate)
library(tidyverse)

df <- 
read.table(text = "ride      Start time            End time                
 1       2019-09-01 06:02:09    2019-09-01 06:03:09  
 2       2019-09-01 10:25:21    2019-09-01 10:26:22
 3       2019-09-01 11:10:34    2019-09-01 11:11:36
 4       2019-09-01 18:10:45    2019-09-01 18:11:47", header = TRUE) %>%
  transmute(ride = ride, start = ymd_hms(paste(Start, time)), 
            end = ymd_hms(paste(End, time.1))) 

df %>%
  mutate(difftime = difftime(end, lag(start), units = "hours"))
#>   ride               start                 end        difftime
#> 1    1 2019-09-01 06:02:09 2019-09-01 06:03:09        NA hours
#> 2    2 2019-09-01 10:25:21 2019-09-01 10:26:22 4.4036111 hours
#> 3    3 2019-09-01 11:10:34 2019-09-01 11:11:36 0.7708333 hours
#> 4    4 2019-09-01 18:10:45 2019-09-01 18:11:47 7.0202778 hours

Created on 2021-11-14 by the reprex package (v2.0.1)

Upvotes: 0

Related Questions