phaser
phaser

Reputation: 625

Maintain date time stamp when calculating time intervals

I calculated the time intervals between date and time based on location and sensor. Here is some of my data:

datehour <- c("2016-03-24 20","2016-03-24 06","2016-03-24 18","2016-03-24 07","2016-03-24 16",
          "2016-03-24 09","2016-03-24 15","2016-03-24 09","2016-03-24 20","2016-03-24 05",
          "2016-03-25 21","2016-03-25 07","2016-03-25 19","2016-03-25 09","2016-03-25 12",
          "2016-03-25 07","2016-03-25 18","2016-03-25 08","2016-03-25 16","2016-03-25 09",
          "2016-03-26 20","2016-03-26 06","2016-03-26 18","2016-03-26 07","2016-03-26 16",
          "2016-03-26 09","2016-03-26 15","2016-03-26 09","2016-03-26 20","2016-03-26 05",
          "2016-03-27 21","2016-03-27 07","2016-03-27 19","2016-03-27 09","2016-03-27 12",
          "2016-03-27 07","2016-03-27 18","2016-03-27 08","2016-03-27 16","2016-03-27 09")
location <- c(1,1,2,2,3,3,4,4,"out","out",1,1,2,2,3,3,4,4,"out","out",
              1,1,2,2,3,3,4,4,"out","out",1,1,2,2,3,3,4,4,"out","out")
sensor <- c(1,16,1,16,1,16,1,16,1,16,1,16,1,16,1,16,1,16,1,16,
            1,16,1,16,1,16,1,16,1,16,1,16,1,16,1,16,1,16,1,16)
Temp <- c(35,34,92,42,21,47,37,42,63,12,35,34,92,42,21,47,37,42,63,12,
          35,34,92,42,21,47,37,42,63,12,35,34,92,42,21,47,37,42,63,12)
df <- data.frame(datehour,location,sensor,Temp)

I used the following code to calculate the time differences. However it does not maintain the correct date hour with each entry. See columns datehour1 and datehour2.

df$datehour <- as.POSIXct(df$datehour, format = "%Y-%m-%d %H")
final.time.df <- setDT(df)[order(datehour, location, sensor), .(difftime(datehour[-length(datehour)], 
                                                                            datehour[-1], unit = "hour"), 
                                                                            datehour1 = datehour[1], datehour2 = datehour[2]), .(location, sensor)]

I would like each time difference to have the two times used to calculate it to identify it. I would like the result to be the following:

    location sensor        V1           datehour1           datehour2
  out     16 -28 hours 2016-03-24 05:00:00 2016-03-25 09:00:00
    1     16 -25 hours 2016-03-24 06:00:00 2016-03-25 07:00:00
    2     16 -26 hours 2016-03-24 07:00:00 2016-03-25 09:00:00
    3     16 -22 hours 2016-03-24 09:00:00 2016-03-25 07:00:00
    4     16 -23 hours 2016-03-24 09:00:00 2016-03-25 08:00:00
    4      1 -27 hours 2016-03-24 15:00:00 2016-03-25 18:00:00
    3      1 -20 hours 2016-03-24 16:00:00 2016-03-25 12:00:00
    2      1 -25 hours 2016-03-24 18:00:00 2016-03-25 19:00:00
    1      1 -25 hours 2016-03-24 20:00:00 2016-03-25 21:00:00
  out      1 -20 hours 2016-03-24 20:00:00 2016-03-25 16:00:00

Upvotes: 0

Views: 606

Answers (1)

Adam Bethke
Adam Bethke

Reputation: 1038

Okay, so I'm not an expert by any means at data.tables solutions, and as a result I'm not quite sure how you're using the grouping statement to resolve the number of values down to 10.

That said, I think the answer to your question (if you haven't already solved this another way) lies in the difftime(datehour[-length(datehour)], datehour[-1], unit = "hour") chunk of code, but not in the fact that it's calculating the difference incorrectly, but in that it's preventing the grouping statement from resolving to the expected number of groups.

I tried separating the grouping from the time difference calculation, and was able to get to your expected output (obviously some formatting required):

final.time.df <- setDT(df)[order(datehour, location, sensor), .(datehour1 = datehour[1], datehour2 = datehour[2]), .(location, sensor)]
final.time.df$diff = final.time.df$datehour1 - final.time.df$datehour2

If I've missed the point, feel free to let me know and I'll delete the answer! I know it's not a particularly insightful answer, but it looks like this might do it, and I'm stuck on a problem myself right now, and wanted to try to help.

Upvotes: 1

Related Questions