Reputation: 1441
I am looking for an efficient measure to calculate the differential hours between times for a given date. My data looks as follows:
Time_formatted Date Hours_diff
1/20/2016 19:19 1/20/2016 0:46
1/20/2016 18:33 1/20/2016 2:43
1/20/2016 15:50 1/20/2016 1:28
1/20/2016 14:22 1/20/2016 1:50
1/20/2016 12:32 1/20/2016 4:52
1/20/2016 7:40 1/20/2016 0
1/19/2016 23:23 1/19/2016
1/19/2016 23:06 1/19/2016
1/19/2016 22:37 1/19/2016
1/19/2016 21:56 1/19/2016
1/19/2016 21:05 1/19/2016
1/19/2016 17:53 1/19/2016
1/19/2016 17:39 1/19/2016
1/19/2016 17:01 1/19/2016
1/19/2016 15:31 1/19/2016
My first thought was to use the ave
function in R:
ave(data$Time_formatted, data$Date, FUN=difftime)
but an obvious error, Error in as.POSIXct(time2) : argument "time2" is missing, with no default
occurs by using this function. A loop will be inefficient as the data is quite large.
Any ideas to approach this kind of problem?
Upvotes: 0
Views: 171
Reputation: 3728
Steps and output below:
> df<-read.csv("data.txt",header=T,stringsAsFactors=F)
> df$Time_formatted<-strptime(a$Time_formatted,"%m/%d/%Y %H:%M")
> df$Date <-strptime(a$Date,"%m/%d/%Y")
> df<-df[order(df$Time_formatted,decreasing=T),] #Make sure it is ordered
> df
Time_formatted Date
1 2016-01-20 19:19:00 2016-01-20
2 2016-01-20 18:33:00 2016-01-20
3 2016-01-20 15:50:00 2016-01-20
4 2016-01-20 14:22:00 2016-01-20
5 2016-01-20 12:32:00 2016-01-20
6 2016-01-20 07:40:00 2016-01-20
7 2016-01-19 23:23:00 2016-01-19
8 2016-01-19 23:06:00 2016-01-19
9 2016-01-19 22:37:00 2016-01-19
10 2016-01-19 21:56:00 2016-01-19
11 2016-01-19 21:05:00 2016-01-19
12 2016-01-19 17:53:00 2016-01-19
13 2016-01-19 17:39:00 2016-01-19
14 2016-01-19 17:01:00 2016-01-19
15 2016-01-19 15:31:00 2016-01-19
> df$Hours_diff<-c(-diff(df$Time_formatted),0) # calculate time difference
> df[which(diff(df$Date)!=0),"Hours_diff"]<-0 # set the last timepoint in day to 0
> df$Hours_diff<-ifelse(df$Hours_diff>0,paste(floor(df$Hours_diff/60),df$Hours_diff%%60,sep=":"),0)
> df
Time_formatted Date Hours_diff
1 2016-01-20 19:19:00 2016-01-20 0:46
2 2016-01-20 18:33:00 2016-01-20 2:43
3 2016-01-20 15:50:00 2016-01-20 1:28
4 2016-01-20 14:22:00 2016-01-20 1:50
5 2016-01-20 12:32:00 2016-01-20 4:52
6 2016-01-20 07:40:00 2016-01-20 0
7 2016-01-19 23:23:00 2016-01-19 0:17
8 2016-01-19 23:06:00 2016-01-19 0:29
9 2016-01-19 22:37:00 2016-01-19 0:41
10 2016-01-19 21:56:00 2016-01-19 0:51
11 2016-01-19 21:05:00 2016-01-19 3:12
12 2016-01-19 17:53:00 2016-01-19 0:14
13 2016-01-19 17:39:00 2016-01-19 0:38
14 2016-01-19 17:01:00 2016-01-19 1:30
15 2016-01-19 15:31:00 2016-01-19 0
Upvotes: 1
Reputation: 15937
First, I prepare the data. I make sure that date and time are stored in the proper format and sort the data frame by the column Time_formatted
:
# convert times to POSIXct, dates to Date
data$Time_formatted <- as.POSIXct(data$Time_formatted, format = "%m/%d/%Y %H:%M")
data$Date <- as.Date(data$Date, format = "%m/%d/%Y")
# sort
data <- data[order(data$Time_formatted), ]
Then I use tapply()
together with diff()
to calculate the differences in minutes. Note that I add an extra zero in order to account for the first time of each day, where the time difference is undefined:
my_diff <- function(x, ...) {
c(0, diff(x, ...))
}
diffs <- unlist(tapply(data$Time_formatted, data$Date, my_diff))
The last step is to convert the time differences from minutes to %H:%M
as follows (see this answer for the use of formatC()
):
mins2hm <- function(min) {
h <- min %/% 60
m <- min %% 60
hm <- paste(h, formatC(m, width = 2, flag = 0), sep = ":")
}
data$diffs <- mins2hm(diffs)
data
## Time_formatted Date diffs
## 15 2016-01-19 15:31:00 2016-01-19 0:00
## 14 2016-01-19 17:01:00 2016-01-19 1:30
## 13 2016-01-19 17:39:00 2016-01-19 0:38
## 12 2016-01-19 17:53:00 2016-01-19 0:14
## 11 2016-01-19 21:05:00 2016-01-19 3:12
## 10 2016-01-19 21:56:00 2016-01-19 0:51
## 9 2016-01-19 22:37:00 2016-01-19 0:41
## 8 2016-01-19 23:06:00 2016-01-19 0:29
## 7 2016-01-19 23:23:00 2016-01-19 0:17
## 6 2016-01-20 07:40:00 2016-01-20 0:00
## 5 2016-01-20 12:32:00 2016-01-20 4:52
## 4 2016-01-20 14:22:00 2016-01-20 1:50
## 3 2016-01-20 15:50:00 2016-01-20 1:28
## 2 2016-01-20 18:33:00 2016-01-20 2:43
## 1 2016-01-20 19:19:00 2016-01-20 0:46
Upvotes: 1