Reputation: 19
Here is the data frame in r,
ride Start time End time
1 2019-09-01 06:02:09 2019-09-01 06:03:09
2 2019-09-01 10:25:21 2019-09-01 10:26:22
3 2019-09-01 11:10:34 2019-09-01 11:11:36
4 2019-09-01 18:10:45 2019-09-01 18:11:47
And I want to create a column in r to express the difference in time between each ride, for example, the difference in time(hours) between second ride and first ride is the start time of second ride minus the end time of first ride. I want the result like this:
ride Start time End time difftime
1 2019-09-01 06:02:09 2019-09-01 06:03:09 NA
2 2019-09-01 10:25:21 2019-09-01 10:26:22 4hours(10:25:21-06:03:09=4hours)
3 2019-09-01 11:10:34 2019-09-01 11:11:36 0.73 hour(11:10:34-10:26:22)
4 2019-09-02 00:10:45 2019-09-01 01:11:47 2019-09-02-00:10:45minus2019-09-01
11:11:36
Thanks in advance!
Upvotes: 0
Views: 140
Reputation: 173
Offset the end_time vector by one row. Delete the last element in the end_time vector and add a null element to its beginning.
ride Start time End time
1 2019-09-01 06:02:09 0
2 2019-09-01 10:25:21 2019-09-01 06:03:09
3 2019-09-01 11:10:34 2019-09-01 10:26:22
4 2019-09-01 18:10:45 2019-09-01 11:11:36
As such, the end time of ride 1 is matched with the start time of ride 2, and the end time of ride 2 is matched with the start time of ride 3, etc.
You can then add a new column to the data frame using the difftime operator. Here I am using NA for the first value of the end_time vector.
ride <- c(1, 2, 3, 4)
start_time <- as.POSIXct(c('2019-09-01 06:02:09', '2019-09-01 10:25:21', '2019-09-01 11:10:34', '2019-09-01 18:10:45'))
end_time <- as.POSIXct(c(NA, '2019-09-01 06:03:09', '2019-09-01 10:26:22', '2019-09-01 11:11:36'))
ride_data <- data.frame(ride, start_time, end_time)
ride_data$diff_val <- with(ride_data, difftime(start_time, end_time, units = "hours"))
Printing the above data frame:
ride start_time end_time diff_val
1 1 2019-09-01 06:02:09 <NA> NA hours
2 2 2019-09-01 10:25:21 2019-09-01 06:03:09 4.3700000 hours
3 3 2019-09-01 11:10:34 2019-09-01 10:26:22 0.7366667 hours
4 4 2019-09-01 18:10:45 2019-09-01 11:11:36 6.9858333 hours
Upvotes: 1
Reputation: 23727
The biggest challenge was to read your data ;) The difftime problem is fairly straight forward with the tidyverse, in particular {dplyr}.
library(lubridate)
library(tidyverse)
df <-
read.table(text = "ride Start time End time
1 2019-09-01 06:02:09 2019-09-01 06:03:09
2 2019-09-01 10:25:21 2019-09-01 10:26:22
3 2019-09-01 11:10:34 2019-09-01 11:11:36
4 2019-09-01 18:10:45 2019-09-01 18:11:47", header = TRUE) %>%
transmute(ride = ride, start = ymd_hms(paste(Start, time)),
end = ymd_hms(paste(End, time.1)))
df %>%
mutate(difftime = difftime(end, lag(start), units = "hours"))
#> ride start end difftime
#> 1 1 2019-09-01 06:02:09 2019-09-01 06:03:09 NA hours
#> 2 2 2019-09-01 10:25:21 2019-09-01 10:26:22 4.4036111 hours
#> 3 3 2019-09-01 11:10:34 2019-09-01 11:11:36 0.7708333 hours
#> 4 4 2019-09-01 18:10:45 2019-09-01 18:11:47 7.0202778 hours
Created on 2021-11-14 by the reprex package (v2.0.1)
Upvotes: 0