rmuc8
rmuc8

Reputation: 2989

Calculate time difference between two events (given date and time) in R

I'm currently struggling with a beginner's issue regarding the calculation of a time difference between two events.

I want to take a column consisting of date and time (both values in one column) into consideration and calculate a time difference with the value of the previous/next row with the same ID (A or B in this example).

ID = c("A", "A", "B", "B")
time = c("08.09.2014 10:34","12.09.2014 09:33","13.08.2014 15:52","11.09.2014 02:30")
d = data.frame(ID,time)

My desired output is in the format Hours:Minutes

time difference = c("94:59","94:59","682:38","682:38")

The format Days:Hours:Minutes or anything similar would also work, as long as it could be conveniently implemented. I am flexible regarding the format of the output, the above is just an idea that crossed my mind.

For each single ID, I always have two rows (in the example 2xA and 2xB). I don't have a convincing idea how to avoid the repition of the difference.

I've tried some examples before, which I found on stackoverflow. Most of them used POSIXt and strptime. However, I didn't manage to apply those ideas to my data set.

Upvotes: 2

Views: 4089

Answers (3)

rmuc8
rmuc8

Reputation: 2989

similar to the attempts of David and hrmbrmstr, I found that this solution using difftime works

I use a rowshift script I found on stackoverflow

rowShift <- function(x, shiftLen = 1L) {
   r <- (1L + shiftLen):(length(x) + shiftLen)
   r[r<1] <- NA
   return(x[r])
}
d$time.c <- as.POSIXct(d$time, format = "%d.%m.%Y %H:%M")
d$time.prev <- rowShift(d$time.c,-1)
d$diff <- difftime(d$time.c,d$time.prev, units="hours")

Every other row of d$diff has positive/negative values in the results. I do remove all the rows with negative values and have the difference between the first and the last time for every ID.

Upvotes: 0

David Arenburg
David Arenburg

Reputation: 92300

Here's my attempt using dplyr

library(dplyr)
d %>%
  mutate(time = as.POSIXct(time, format = "%d.%m.%Y %H:%M")) %>%
  group_by(ID) %>%
  mutate(diff = paste0(gsub("[.].*", "", diff(time)*24), ":",
                       round(as.numeric(gsub(".*[.]", ".", diff(time)*24))*60)))


# Source: local data frame [4 x 3]
# Groups: ID
# 
#   ID                time   diff
# 1  A 2014-09-08 10:34:00  94:59
# 2  A 2014-09-12 09:33:00  94:59
# 3  B 2014-08-13 15:52:00 682:38
# 4  B 2014-09-11 02:30:00 682:38

Upvotes: 2

hrbrmstr
hrbrmstr

Reputation: 78832

A very (to me) hack-ish base solution:

ID <- c("A", "A", "B", "B")
time <- c("08.09.2014 10:34", "12.09.2014 09:33", "13.08.2014 15:52","11.09.2014 02:30")
d <- data.frame(ID, time)

d$time <- as.POSIXct(d$time, format="%d.%m.%Y %H:%M")

unlist(unname(lapply(split(d, d$ID), function(d) {
  sapply(abs(diff(c(d$time[2], d$time))), function(x) {
    sprintf("%s:%s", round(((x*24)%/%1)), round(((x*24)%%1 *60)))
  }) 
})))

## [1] "94:59"  "94:59"  "682:38" "682:38"

I have to believe this function exists somewhere already, tho.

Upvotes: 1

Related Questions