Choc_waffles
Choc_waffles

Reputation: 537

Create a vector of date and time by the hour

I want to create a vector of hourly interval (specific start and end time) across the pre-specified date (date range).

without the date range, i've managed to

seq(as.POSIXct("2020-02-20 09:00:00"), as.POSIXct("2020-02-20 18:00:00"), by="hour")

which generates

2020-02-20 09:00:00
2020-02-20 10:00:00
2020-02-20 11:00:00
2020-02-20 12:00:00
2020-02-20 13:00:00
2020-02-20 14:00:00
2020-02-20 15:00:00
2020-02-20 16:00:00
2020-02-20 17:00:00
2020-02-20 18:00:00

I would like to utilize these variables and create a version which generates the same as above, but for the given date ranges.

min_date = as.Date('2020-02-03')
max_date = as.Date('2020-02-07')

min_hour = "09:00:00"
max_hour = "18:00:00"

#maybe the following code to convert the above into sequences aboves?
as.POSIXct(trimws(paste0(min_date, " ", min_hour)))
as.POSIXct(trimws(paste0(min_date, " ", max_hour )))
as.POSIXct(trimws(paste0(max_date, " ", min_hour)))
as.POSIXct(trimws(paste0(max_date, " ", max_hour )))

Upvotes: 3

Views: 1265

Answers (2)

jay.sf
jay.sf

Reputation: 72828

This solution gives you a vector of date-times.

It might be a little tedious to work with times, because often they are coerced into numerics (i.e. seconds since "1970-01-01"). To prevent this you could work with the digits as numerics, do an expand.grid which rows you convert to "POSIXct" using strptime. This is also fast and gives a vector of date-times.

mkDateTime <- function(min.date, max.date, min.hour, max.hour, tz="GMT") {
  n <- as.numeric(
    gsub("\\D", "", mapply(as.character, list(min.date, max.date, min.hour, max.hour))))
  e <- expand.grid(n[1]:n[2], sprintf("%02d0000", (n[3]/1e4):(n[4]/1e4)))
  unname(strptime(apply(e[order(e$Var1), ], 1, paste, collapse=" "), "%Y%m%d %H%M%S", tz=tz))
}

mkDateTime(min_date, max_date, min_hour, max_hour)
# [1] "2020-02-03 09:00:00 GMT" "2020-02-03 10:00:00 GMT" "2020-02-03 11:00:00 GMT"
# [4] "2020-02-03 12:00:00 GMT" "2020-02-03 13:00:00 GMT" "2020-02-03 14:00:00 GMT"
# [7] "2020-02-03 15:00:00 GMT" "2020-02-03 16:00:00 GMT" "2020-02-03 17:00:00 GMT"
# [10] "2020-02-03 18:00:00 GMT" "2020-02-04 09:00:00 GMT" "2020-02-04 10:00:00 GMT"
# [13] "2020-02-04 11:00:00 GMT" "2020-02-04 12:00:00 GMT" "2020-02-04 13:00:00 GMT"
# [16] "2020-02-04 14:00:00 GMT" "2020-02-04 15:00:00 GMT" "2020-02-04 16:00:00 GMT"
# [19] "2020-02-04 17:00:00 GMT" "2020-02-04 18:00:00 GMT" "2020-02-05 09:00:00 GMT"
# [22] "2020-02-05 10:00:00 GMT" "2020-02-05 11:00:00 GMT" "2020-02-05 12:00:00 GMT"
# [25] "2020-02-05 13:00:00 GMT" "2020-02-05 14:00:00 GMT" "2020-02-05 15:00:00 GMT"
# [28] "2020-02-05 16:00:00 GMT" "2020-02-05 17:00:00 GMT" "2020-02-05 18:00:00 GMT"
# [31] "2020-02-06 09:00:00 GMT" "2020-02-06 10:00:00 GMT" "2020-02-06 11:00:00 GMT"
# [34] "2020-02-06 12:00:00 GMT" "2020-02-06 13:00:00 GMT" "2020-02-06 14:00:00 GMT"
# [37] "2020-02-06 15:00:00 GMT" "2020-02-06 16:00:00 GMT" "2020-02-06 17:00:00 GMT"
# [40] "2020-02-06 18:00:00 GMT" "2020-02-07 09:00:00 GMT" "2020-02-07 10:00:00 GMT"
# [43] "2020-02-07 11:00:00 GMT" "2020-02-07 12:00:00 GMT" "2020-02-07 13:00:00 GMT"
# [46] "2020-02-07 14:00:00 GMT" "2020-02-07 15:00:00 GMT" "2020-02-07 16:00:00 GMT"
# [49] "2020-02-07 17:00:00 GMT" "2020-02-07 18:00:00 GMT"

You could also create a sequence from minimum date and time to maximum date and time and delete those times not in the defined time range. I use data.table here for %inrange%.

library(data.table)
mkDateTime2 <- function(min.date, max.date, min.hour, max.hour) {
  v <- seq(as.POSIXct(paste(min.date, min.hour)),
           as.POSIXct(paste(max.date, max.hour)), "hour")
  v[as.double(gsub("\\D", "", substr(as.character(v), 12, 13))) %inrange% 
      as.numeric(substr(c(min.hour, max.hour), 1, 2))]
}

mkDateTime2(min_date, max_date, min_hour, max_hour)
# [1] "2020-02-03 09:00:00 CET" "2020-02-03 10:00:00 CET" "2020-02-03 11:00:00 CET"
# [4] "2020-02-03 12:00:00 CET" "2020-02-03 13:00:00 CET" "2020-02-03 14:00:00 CET"
# [7] "2020-02-03 15:00:00 CET" "2020-02-03 16:00:00 CET" "2020-02-03 17:00:00 CET"
# [10] "2020-02-03 18:00:00 CET" "2020-02-04 09:00:00 CET" "2020-02-04 10:00:00 CET"
# [13] "2020-02-04 11:00:00 CET" "2020-02-04 12:00:00 CET" "2020-02-04 13:00:00 CET"
# [16] "2020-02-04 14:00:00 CET" "2020-02-04 15:00:00 CET" "2020-02-04 16:00:00 CET"
# [19] "2020-02-04 17:00:00 CET" "2020-02-04 18:00:00 CET" "2020-02-05 09:00:00 CET"
# [22] "2020-02-05 10:00:00 CET" "2020-02-05 11:00:00 CET" "2020-02-05 12:00:00 CET"
# [25] "2020-02-05 13:00:00 CET" "2020-02-05 14:00:00 CET" "2020-02-05 15:00:00 CET"
# [28] "2020-02-05 16:00:00 CET" "2020-02-05 17:00:00 CET" "2020-02-05 18:00:00 CET"
# [31] "2020-02-06 09:00:00 CET" "2020-02-06 10:00:00 CET" "2020-02-06 11:00:00 CET"
# [34] "2020-02-06 12:00:00 CET" "2020-02-06 13:00:00 CET" "2020-02-06 14:00:00 CET"
# [37] "2020-02-06 15:00:00 CET" "2020-02-06 16:00:00 CET" "2020-02-06 17:00:00 CET"
# [40] "2020-02-06 18:00:00 CET" "2020-02-07 09:00:00 CET" "2020-02-07 10:00:00 CET"
# [43] "2020-02-07 11:00:00 CET" "2020-02-07 12:00:00 CET" "2020-02-07 13:00:00 CET"
# [46] "2020-02-07 14:00:00 CET" "2020-02-07 15:00:00 CET" "2020-02-07 16:00:00 CET"
# [49] "2020-02-07 17:00:00 CET" "2020-02-07 18:00:00 CET"

It might be easier to understand but it's also slower.

Upvotes: 1

r2evans
r2evans

Reputation: 160447

lapply(seq(as.Date('2020-02-03'), as.Date('2020-02-07'), by = "day"),
       function(dy) seq(as.POSIXct(paste(dy, "09:00:00")), as.POSIXct(paste(dy, "18:00:00")), by="hour") )
# [[1]]
#  [1] "2020-02-03 09:00:00 PST" "2020-02-03 10:00:00 PST"
#  [3] "2020-02-03 11:00:00 PST" "2020-02-03 12:00:00 PST"
#  [5] "2020-02-03 13:00:00 PST" "2020-02-03 14:00:00 PST"
#  [7] "2020-02-03 15:00:00 PST" "2020-02-03 16:00:00 PST"
#  [9] "2020-02-03 17:00:00 PST" "2020-02-03 18:00:00 PST"
# [[2]]
#  [1] "2020-02-04 09:00:00 PST" "2020-02-04 10:00:00 PST"
#  [3] "2020-02-04 11:00:00 PST" "2020-02-04 12:00:00 PST"
#  [5] "2020-02-04 13:00:00 PST" "2020-02-04 14:00:00 PST"
#  [7] "2020-02-04 15:00:00 PST" "2020-02-04 16:00:00 PST"
#  [9] "2020-02-04 17:00:00 PST" "2020-02-04 18:00:00 PST"
# [[3]]
#  [1] "2020-02-05 09:00:00 PST" "2020-02-05 10:00:00 PST"
#  [3] "2020-02-05 11:00:00 PST" "2020-02-05 12:00:00 PST"
#  [5] "2020-02-05 13:00:00 PST" "2020-02-05 14:00:00 PST"
#  [7] "2020-02-05 15:00:00 PST" "2020-02-05 16:00:00 PST"
#  [9] "2020-02-05 17:00:00 PST" "2020-02-05 18:00:00 PST"
# [[4]]
#  [1] "2020-02-06 09:00:00 PST" "2020-02-06 10:00:00 PST"
#  [3] "2020-02-06 11:00:00 PST" "2020-02-06 12:00:00 PST"
#  [5] "2020-02-06 13:00:00 PST" "2020-02-06 14:00:00 PST"
#  [7] "2020-02-06 15:00:00 PST" "2020-02-06 16:00:00 PST"
#  [9] "2020-02-06 17:00:00 PST" "2020-02-06 18:00:00 PST"
# [[5]]
#  [1] "2020-02-07 09:00:00 PST" "2020-02-07 10:00:00 PST"
#  [3] "2020-02-07 11:00:00 PST" "2020-02-07 12:00:00 PST"
#  [5] "2020-02-07 13:00:00 PST" "2020-02-07 14:00:00 PST"
#  [7] "2020-02-07 15:00:00 PST" "2020-02-07 16:00:00 PST"
#  [9] "2020-02-07 17:00:00 PST" "2020-02-07 18:00:00 PST"

Upvotes: 3

Related Questions