Diomides

Reputation: 41

R apply.hourly endpoints xts

I have a data.frame with weather data. The observations have a resolution of 15 minutes and each timestamp is in the following format: %Y-%m-%d %H:%M:%S.

I am working on a Shiny application in which the user can plot the data at any aggregation level from hourly to daily. To do that, I added a slider with values from 1 to 24, and its value is used in the following function:

# hourly aggregate 
apply.hourly <- function(x, FUN,...) {
  ep <- endpoints(x, 'hours', input$slider)
  period.apply(x, ep, FUN, ...)
}

However, this does not work as expected, so I checked the endpoints returned by the endpoints() function. When input$slider is larger than 1, the function returns results that I cannot interpret. Here is an example:

library(xts)

# Creating a sample time series
date_time <- seq.POSIXt(from = as.POSIXct("2022-07-10 00:00:00"),
                        to = as.POSIXct("2022-07-13 23:45:00"),
                        by = "hour")

head(date_time)
#> [1] "2022-07-10 00:00:00 CEST" "2022-07-10 01:00:00 CEST"
#> [3] "2022-07-10 02:00:00 CEST" "2022-07-10 03:00:00 CEST"
#> [5] "2022-07-10 04:00:00 CEST" "2022-07-10 05:00:00 CEST"

# Calculating the endpoints for 1 hour 
ep1 <- endpoints(date_time, on = "hours", k = 1)

ep1
#> [1]  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 
#> [25] 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 
#> [49] 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 
#> [73] 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 
#> [97] 96

# Calculating the endpoints for 2 hours
ep2 <- endpoints(date_time, on = "hours", k = 2)

ep2
#> [1]  0  1  3  5  7  9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45 
#> [25] 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91 93 
#> [49] 95 96

# Calculating the endpoints for 3 hours 
ep3 <- endpoints(date_time, on = "hours", k = 3)

ep3
#> [1]  0  3  6  9 12 15 18 21 24 27 30 33 36 39 42 45 48 51 54 57 60 63 66 69 
#> [25] 72 75 78 81 84 87 90 93 96

# Calculating the endpoints for 4 hours 
ep4 <- endpoints(date_time, on = "hours", k = 4)

ep4
#> [1]  0  3  7 11 15 19 23 27 31 35 39 43 47 51 55 59 63 67 71 75 79 83 87 91 
#> [25] 95 96

Created on 2022-07-26 by the reprex package (v2.0.1)

My question is the following:

According to ?endpoints, endpoints returns a numeric vector corresponding to the last observation in each period specified by on.

So why, with k = 2, does the first 2-hour period end at the 1st observation, when only 1 hour has passed?

Similarly, why is the first endpoint at the 3rd observation for both k = 3 and k = 4?

Upvotes: 1

Views: 101

Answers (1)

Joshua Ulrich

Reputation: 176648

This happens because endpoints() always counts the kth hour from the origin/epoch (midnight 1970-01-01), not from the first observation in your data. In your example, endpoints() calculates the first 'k'th hour of the day as:

  1. 2022-07-10 01:00:00 for k = 1
  2. 2022-07-10 02:00:00 for k = 2
  3. 2022-07-10 03:00:00 for k = 3
  4. 2022-07-10 04:00:00 for k = 4
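
The idea of epoch-aligned periods can be sketched in base R, without xts. This is only an illustration of bucketing observations by whole hours since the epoch; it is not the actual endpoints() implementation and may differ from it in details such as time zone handling. The helper name sketch_endpoints is hypothetical, and the timestamps are fixed to UTC so the result does not depend on the local time zone.

```r
# Hypothetical helper: mimics the idea of epoch-aligned k-hour buckets.
# It is NOT the xts implementation.
sketch_endpoints <- function(x, k = 1) {
  hours_since_epoch <- as.numeric(x) %/% 3600  # whole hours since 1970-01-01 00:00 UTC
  bucket <- hours_since_epoch %/% k            # k-hour bucket of each observation
  # Positions where the bucket changes, plus the 0 and length(x) bookends
  c(0, which(diff(bucket) != 0), length(x))
}

# Eight hourly timestamps starting at midnight UTC (assumed sample data)
x <- seq(as.POSIXct("2022-07-10 00:00:00", tz = "UTC"),
         by = "hour", length.out = 8)

ep_aligned <- sketch_endpoints(x, k = 3)      # series starts on a 3-hour boundary
ep_shifted <- sketch_endpoints(x[-1], k = 3)  # series starts one hour after a boundary

ep_aligned
#> [1] 0 3 6 8
ep_shifted
#> [1] 0 2 5 7
```

When the series starts on a bucket boundary, the first period holds a full 3 observations. When it starts one hour later, the first period is cut short, because the boundaries are tied to the epoch and not to the first timestamp.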

Another thing to keep in mind: the endpoints() result always starts with 0 and ends with length(x) (i.e. c(0, ..., 10) for a 10-element vector), regardless of where the end locations are.
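
The leading 0 and trailing length(x) are what let consecutive pairs of endpoints delimit each period. A minimal base-R sketch of that convention, using made-up values and a hand-written endpoints-style vector (it mimics how such a vector can be consumed and is not the period.apply() source):

```r
values <- 1:8        # made-up observations
ep <- c(0, 3, 6, 8)  # endpoints-style vector: starts at 0, ends at length(values)

# Period i spans positions (ep[i] + 1) through ep[i + 1]
period_means <- vapply(
  seq_len(length(ep) - 1),
  function(i) mean(values[(ep[i] + 1):ep[i + 1]]),
  numeric(1)
)

period_means
#> [1] 2.0 5.0 7.5
```

Without the bookends, the first and last periods would have no start or end position to slice from.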

Upvotes: 1
