Reputation: 41
I have a data.frame with weather data. The observations have a resolution of 15 minutes, and each timestamp is in the format %Y-%m-%d %H:%M:%S.
I am working on a Shiny application where I want to let the user plot the data at aggregation levels ranging from hourly to daily. To achieve that, I have included a slider with values from 1 to 24, which is used in the following function:
# hourly aggregate
apply.hourly <- function(x, FUN, ...) {
  ep <- endpoints(x, 'hours', input$slider)
  period.apply(x, ep, FUN, ...)
}
However, that doesn't work properly, so I checked the endpoints returned by the endpoints() function. For some reason, when input$slider is larger than 1, the function returns results that I cannot interpret. Here is an example:
library(xts)
# Creating a sample time series
date_time <- seq.POSIXt(from = as.POSIXct("2022-07-10 00:00:00"),
                        to = as.POSIXct("2022-07-13 23:45:00"),
                        by = "hour")
head(date_time)
#> [1] "2022-07-10 00:00:00 CEST" "2022-07-10 01:00:00 CEST"
#> [3] "2022-07-10 02:00:00 CEST" "2022-07-10 03:00:00 CEST"
#> [5] "2022-07-10 04:00:00 CEST" "2022-07-10 05:00:00 CEST"
# Calculating the endpoints for 1 hour
ep1 <- endpoints(date_time, on = "hours", k = 1)
ep1
#> [1] 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
#> [25] 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
#> [49] 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
#> [73] 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
#> [97] 96
# Calculating the endpoints for 2 hours
ep2 <- endpoints(date_time, on = "hours", k = 2)
ep2
#> [1] 0 1 3 5 7 9 11 13 15 17 19 21 23 25 27 29 31 33 35 37 39 41 43 45
#> [25] 47 49 51 53 55 57 59 61 63 65 67 69 71 73 75 77 79 81 83 85 87 89 91 93
#> [49] 95 96
# Calculating the endpoints for 3 hours
ep3 <- endpoints(date_time, on = "hours", k = 3)
ep3
#> [1] 0 3 6 9 12 15 18 21 24 27 30 33 36 39 42 45 48 51 54 57 60 63 66 69
#> [25] 72 75 78 81 84 87 90 93 96
# Calculating the endpoints for 4 hours
ep4 <- endpoints(date_time, on = "hours", k = 4)
ep4
#> [1] 0 3 7 11 15 19 23 27 31 35 39 43 47 51 55 59 63 67 71 75 79 83 87 91
#> [25] 95 96
Created on 2022-07-26 by the reprex package (v2.0.1)
My question is the following:
According to ?endpoints, endpoints() returns a numeric vector corresponding to the last observation in each period specified by on.
So why, with k = 2, does the first 2-hour period end at the 1st observation, when only one hour has passed?
Similarly, with k = 3 and k = 4, why is the first endpoint at the 3rd observation in both cases?
Upvotes: 1
Views: 101
Reputation: 176648
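This happens because endpoints() always counts the kth hour from the origin/epoch (midnight 1970-01-01), not from your first observation. In your example, the first 'k'th-hour boundary therefore falls wherever that epoch-based grid puts it, so the first period in your data can contain fewer than k observations.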
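To make the idea concrete, here is a minimal base-R sketch of epoch-aligned bucketing. It is an illustration only, not the xts source: the helper name epoch_aligned_endpoints is hypothetical, and the timestamps are created in UTC on purpose so that the arithmetic does not depend on the local time zone. Real endpoints() output for your data may differ depending on time zone and xts version.

```r
# Illustrative sketch only (assumption: NOT the actual xts implementation).
# Hourly timestamps in UTC, deliberately starting at an odd hour since the epoch.
ts_utc <- seq(as.POSIXct("2022-07-10 01:00:00", tz = "UTC"),
              by = "hour", length.out = 8)

# Hypothetical helper: assign each timestamp to a k-hour bucket counted from
# the epoch (1970-01-01 00:00:00 UTC) and return the last position of each
# bucket, bracketed by 0 and length(times).
epoch_aligned_endpoints <- function(times, k) {
  hours_since_epoch <- as.numeric(times) %/% 3600
  bucket <- hours_since_epoch %/% k
  c(0, which(diff(bucket) != 0), length(times))
}

ep_k2 <- epoch_aligned_endpoints(ts_utc, k = 2)
ep_k3 <- epoch_aligned_endpoints(ts_utc, k = 3)

print(ep_k2)  # 0 1 3 5 7 8
print(ep_k3)  # 0 2 5 8
```

With k = 2 the first bucket holds a single observation, because the first timestamp is the last hour of an epoch-aligned 2-hour bucket. The same effect produces a short first period whenever the series does not start on a bucket boundary.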
Another thing to keep in mind: the endpoints() result always starts with 0 and ends with length(x) (i.e. c(0, ..., 10) for a 10-element vector), regardless of where the end locations are.
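The leading 0 and trailing length(x) are what let consecutive endpoints delimit blocks: block i runs from position ep[i] + 1 to ep[i + 1], which, as far as I understand it, is how period.apply() consumes the vector. A minimal base-R sketch, with made-up values and a hand-written endpoint vector (both are assumptions for illustration):

```r
# Made-up data: 10 values and a hand-written endpoint vector that
# starts at 0 and ends at length(values).
values <- c(4, 8, 15, 16, 23, 42, 7, 1, 9, 2)
ep <- c(0, 3, 7, 10)

# Block i covers positions (ep[i] + 1) through ep[i + 1].
block_means <- vapply(seq_len(length(ep) - 1), function(i) {
  mean(values[(ep[i] + 1):ep[i + 1]])
}, numeric(1))

print(block_means)  # 9 22 4
```

Without the 0 the first block would have no start, and without length(x) the trailing observations would be dropped.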
Upvotes: 1