Reputation: 35
I am trying to find the difference between two date-times in hours. When the two values fall on different days, I get wildly incorrect numbers.
Here is an example of the data:
Observation  Status    DateTime
1            Active    2016-11-04 22:32:49
2            Inactive  2016-11-05 08:30:56
I am running this command:
getDiff <- function(x) {
  difftime(shift(x, fill = NA, type = "lead"), x, units = "hours")
}
diff_result <- dataframe[, time.diff := ifelse(Status == "Active",
                                               getDiff(DateTime), NA)]
And I get the following output:
Observation  Status    DateTime             Time.diff
1            Active    2016-11-04 22:32:49  8757.884
2            Inactive  2016-11-05 08:30:56
This command works for every difference that falls within a single day. For the rows above, the correct answer should be around 10 hours, not over 8000.
Also,
> class(DataFrame$DateTime)
[1] "POSIXct" "POSIXt"
Thank you in advance!
Upvotes: 0
Views: 1785
Reputation: 20095
It seems the OP has not converted DateTime correctly. 8757 hours is roughly one year (365 × 24 = 8760 hours), so it is possible that some DateTime values were parsed with the wrong format.
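As a rough check of the "about one year" reading, the sketch below converts the reported value to days and compares it with the gap between two timestamps exactly one calendar year apart. The second timestamp is hypothetical and chosen only for illustration; it does not come from the OP's data.

```r
# Convert the value reported in the question from hours to days
reported_hours <- 8757.884
reported_days <- reported_hours / 24

# Two hypothetical timestamps one calendar year apart (UTC avoids DST shifts)
t1 <- as.POSIXct("2016-11-04 22:32:49", tz = "UTC")
t2 <- as.POSIXct("2017-11-04 22:32:49", tz = "UTC")
one_year_hours <- as.numeric(difftime(t2, t1, units = "hours"))

print(reported_days)   # a little under 365 days
print(one_year_hours)  # hours in that one-year gap
```

If the reported difference sits this close to a full year, it may be worth checking whether the year component of one of the two timestamps was read incorrectly.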
Using the OP's sample data, the result looks correct on my end:
library(data.table)
getDiff <- function(x) {
  difftime(shift(x, fill = NA, type = "lead"), x, units = "hours")
}
setDT(df)
diff_result <- df[, time.diff := ifelse(Status == "Active",
                                        getDiff(DateTime), NA)]
diff_result
#    Observation   Status            DateTime time.diff
# 1:           1   Active 2016-11-04 22:32:49  9.968611
# 2:           2 Inactive 2016-11-05 08:30:56        NA
Data:
df <- read.table(text =
"Observation Status DateTime
1 Active '2016-11-04 22:32:49'
2 Inactive '2016-11-05 08:30:56'",
  header = TRUE, stringsAsFactors = FALSE)
df$DateTime <- as.POSIXct(df$DateTime, format = "%Y-%m-%d %H:%M:%S")
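One way to look for wrongly formatted values is to parse the raw strings with an explicit format and inspect what comes back. This is only a sketch: the `raw` vector is hypothetical, and its third entry deliberately uses a different layout to show how a mismatch surfaces.

```r
# Hypothetical raw strings; the third one does not follow "%Y-%m-%d %H:%M:%S"
raw <- c("2016-11-04 22:32:49", "2016-11-05 08:30:56", "05/11/2016 08:30:56")

parsed <- as.POSIXct(raw, format = "%Y-%m-%d %H:%M:%S", tz = "UTC")

# Strings that do not match the format are returned as NA
bad <- is.na(parsed)
print(raw[bad])

# The range of the parsed values shows whether any date landed in an unexpected year
print(range(parsed, na.rm = TRUE))
```

Running the same two checks on the real DateTime column before computing differences should show whether any row was parsed into an unexpected date.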
Upvotes: 1