eFF

Reputation: 277

Re-aggregating data - from coarse to finer temporal resolution

I would like to follow up on a question answered by @r2evans: "Interpolation in R: retrieving hourly values". I am trying to re-aggregate 3-hourly data into hourly data. If I use the following small reproducible dataset ("tair"):

 tair<-structure(list(Year = c(1991L, 1991L, 1991L, 1991L, 1991L, 1991L, 1991L, 1991L), 
                 Month = c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L), 
                 DoY = c(1L,1L, 1L, 1L, 1L, 1L, 1L, 2L), 
                 Hour = c(3L, 6L, 9L, 12L, 15L, 18L, 21L, 0L), 
                 Kobb = c(3.032776, 3.076996, 3.314209, 1.760345, 1.473724,1.295837, 2.72229, 3.209503), 
                 DateTime = structure(c(662698800,662709600, 662720400, 662731200, 662742000, 662752800, 662763600, 662774400), class = c("POSIXct", "POSIXt"), tzone = "UTC")), 
                 row.names = c(NA,8L), class = "data.frame")

in the following code:

library(zoo)
newdt <- seq.POSIXt(tair$DateTime[1], tail(tair$DateTime, n=1), by='1 hour');newdt
tair_hourly<-data.frame(datetime=newdt, Kobb=approx(tair$DateTime, tair$Kobb, newdt)$y)

It does the expected job, i.e. it interpolates the 3-hourly data to hourly values. This works for variables such as temperature or radiation. However, for a variable such as precipitation (stochastic), I would like to hold each 3-hourly value constant across the corresponding hourly steps (and perhaps divide it by 3) instead of interpolating. I simply need hourly data; that is the reason for all this.

Any ideas on how I can adapt the small piece of code above to do this?

Upvotes: 0

Views: 124

Answers (2)

r2evans

Reputation: 160447

Two suggestions.

Base R

tair2_list <- lapply(seq_len(nrow(tair) - 1), function(ind) {
  times <- seq(tair$DateTime[ind], tair$DateTime[ind+1] - 1, by = "1 hour")
  data.frame(
    DateTime = times,
    NewKobb = rep(tair$Kobb[ind] / length(times), length(times)),
    # for reference only
    Kobb = c(tair$Kobb[ind], rep(NA, length(times)-1))
  )
})

tair2 <- do.call(rbind, tair2_list)
tair2
#               DateTime   NewKobb     Kobb
# 1  1991-01-01 03:00:00 1.0109253 3.032776
# 2  1991-01-01 04:00:00 1.0109253       NA
# 3  1991-01-01 05:00:00 1.0109253       NA
# 4  1991-01-01 06:00:00 1.0256653 3.076996
# 5  1991-01-01 07:00:00 1.0256653       NA
# 6  1991-01-01 08:00:00 1.0256653       NA
# 7  1991-01-01 09:00:00 1.1047363 3.314209
# 8  1991-01-01 10:00:00 1.1047363       NA
# 9  1991-01-01 11:00:00 1.1047363       NA
# 10 1991-01-01 12:00:00 0.5867817 1.760345
# 11 1991-01-01 13:00:00 0.5867817       NA
# 12 1991-01-01 14:00:00 0.5867817       NA
# 13 1991-01-01 15:00:00 0.4912413 1.473724
# 14 1991-01-01 16:00:00 0.4912413       NA
# 15 1991-01-01 17:00:00 0.4912413       NA
# 16 1991-01-01 18:00:00 0.4319457 1.295837
# 17 1991-01-01 19:00:00 0.4319457       NA
# 18 1991-01-01 20:00:00 0.4319457       NA
# 19 1991-01-01 21:00:00 0.9074300 2.722290
# 20 1991-01-01 22:00:00 0.9074300       NA
# 21 1991-01-01 23:00:00 0.9074300       NA

Subtracting 1 (one second) in tair$DateTime[ind+1] - 1 ensures that the timestamp of the next observation is not inadvertently included at the end of the current hourly sequence.
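To see the effect of the one-second offset in isolation, here is a minimal sketch using two timestamps from the sample data. The variable names t0, t1, with_end and without_end are made up for illustration only.

```r
# Two consecutive 3-hourly timestamps from the sample data (UTC).
t0 <- as.POSIXct("1991-01-01 03:00:00", tz = "UTC")
t1 <- as.POSIXct("1991-01-01 06:00:00", tz = "UTC")

# Without the offset, the sequence includes the next observation's timestamp.
with_end <- seq(t0, t1, by = "1 hour")        # 03:00, 04:00, 05:00, 06:00

# Subtracting one second stops the sequence before 06:00.
without_end <- seq(t0, t1 - 1, by = "1 hour") # 03:00, 04:00, 05:00

length(with_end)     # 4
length(without_end)  # 3
```

Without the offset, every 3-hourly value would be split over four rows, and each boundary hour would appear twice after rbind.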

tidyverse

library(dplyr)
library(purrr)
library(tidyr)
tair %>%
  mutate(DateTime2 = purrr::map2(DateTime, lead(DateTime - 1, default = last(DateTime)), 
                                 ~ tibble(DateTime2 = seq(.x, .y, by = "1 hour"))) ) %>%
  unnest(DateTime2) %>%
  group_by(DateTime) %>%
  mutate(
    NewKobb = Kobb / n(),
    Kobb = c(Kobb[1], rep(NA, n()-1))
  ) %>%
  ungroup()
# # A tibble: 22 x 8
#     Year Month   DoY  Hour  Kobb DateTime            DateTime2           NewKobb
#    <int> <int> <int> <int> <dbl> <dttm>              <dttm>                <dbl>
#  1  1991     1     1     3  3.03 1991-01-01 03:00:00 1991-01-01 03:00:00   1.01 
#  2  1991     1     1     3 NA    1991-01-01 03:00:00 1991-01-01 04:00:00   1.01 
#  3  1991     1     1     3 NA    1991-01-01 03:00:00 1991-01-01 05:00:00   1.01 
#  4  1991     1     1     6  3.08 1991-01-01 06:00:00 1991-01-01 06:00:00   1.03 
#  5  1991     1     1     6 NA    1991-01-01 06:00:00 1991-01-01 07:00:00   1.03 
#  6  1991     1     1     6 NA    1991-01-01 06:00:00 1991-01-01 08:00:00   1.03 
#  7  1991     1     1     9  3.31 1991-01-01 09:00:00 1991-01-01 09:00:00   1.10 
#  8  1991     1     1     9 NA    1991-01-01 09:00:00 1991-01-01 10:00:00   1.10 
#  9  1991     1     1     9 NA    1991-01-01 09:00:00 1991-01-01 11:00:00   1.10 
# 10  1991     1     1    12  1.76 1991-01-01 12:00:00 1991-01-01 12:00:00   0.587
# # ... with 12 more rows

(I feel like there is a better way to do this...)
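One possibly simpler route, offered only as a sketch: base R's approx() has a method = "constant" option that, with f = 0, returns the previous observed value instead of interpolating linearly. The example below rebuilds only the DateTime and Kobb columns of the sample data. The names step and tair_step are made up for illustration, and dividing by 3 assumes each 3-hourly value covers the three hours starting at its timestamp.

```r
# Minimal copy of the sample data (DateTime and Kobb only).
tair <- data.frame(
  DateTime = as.POSIXct("1991-01-01 03:00:00", tz = "UTC") + 3 * 3600 * (0:7),
  Kobb = c(3.032776, 3.076996, 3.314209, 1.760345,
           1.473724, 1.295837, 2.72229, 3.209503)
)

# Hourly target grid from the first to the last observation.
newdt <- seq(tair$DateTime[1], tail(tair$DateTime, 1), by = "1 hour")

# Step interpolation: f = 0 carries the left-hand (previous) value forward.
step <- approx(as.numeric(tair$DateTime), tair$Kobb,
               xout = as.numeric(newdt), method = "constant", f = 0)$y

tair_step <- data.frame(
  DateTime = newdt,
  Kobb3h   = step,      # 3-hourly value held constant
  NewKobb  = step / 3   # assumption: value spread evenly over 3 hours
)
head(tair_step, 4)
```

Note that the final timestamp (1991-01-02 00:00:00) appears only once in this grid, so its NewKobb covers only the first hour of its 3-hour window; drop or extend that row as appropriate for your data.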

Upvotes: 1

TichPi

Reputation: 146

I think this is what you need, and it is an easy way to do it:

library(foqat)
tair=tair[,c(6,1:5)]  #move DateTime into first column.
tair2=trs(tair, "1 hour") #any temporal resolution, e.g., "1hour".
tair2
#              DateTime Year Month DoY Hour     Kobb
#1  1991-01-01 03:00:00 1991     1   1    3 3.032776
#2  1991-01-01 04:00:00   NA    NA  NA   NA       NA
#3  1991-01-01 05:00:00   NA    NA  NA   NA       NA
#4  1991-01-01 06:00:00 1991     1   1    6 3.076996
#5  1991-01-01 07:00:00   NA    NA  NA   NA       NA
#6  1991-01-01 08:00:00   NA    NA  NA   NA       NA
#7  1991-01-01 09:00:00 1991     1   1    9 3.314209
#8  1991-01-01 10:00:00   NA    NA  NA   NA       NA
#9  1991-01-01 11:00:00   NA    NA  NA   NA       NA
#10 1991-01-01 12:00:00 1991     1   1   12 1.760345
#11 1991-01-01 13:00:00   NA    NA  NA   NA       NA
#12 1991-01-01 14:00:00   NA    NA  NA   NA       NA
#13 1991-01-01 15:00:00 1991     1   1   15 1.473724
#14 1991-01-01 16:00:00   NA    NA  NA   NA       NA
#15 1991-01-01 17:00:00   NA    NA  NA   NA       NA
#16 1991-01-01 18:00:00 1991     1   1   18 1.295837
#17 1991-01-01 19:00:00   NA    NA  NA   NA       NA
#18 1991-01-01 20:00:00   NA    NA  NA   NA       NA
#19 1991-01-01 21:00:00 1991     1   1   21 2.722290
#20 1991-01-01 22:00:00   NA    NA  NA   NA       NA
#21 1991-01-01 23:00:00   NA    NA  NA   NA       NA
#22 1991-01-02 00:00:00 1991     1   2    0 3.209503
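The padded table above still has NA in the inserted hours, so one more step is needed to hold the precipitation value constant. The following is a base R sketch on a small hypothetical stand-in for that table (the data frame hourly and the columns observed, carried and NewKobb are illustrative names, not part of foqat). It assumes the first row is an observed value and that each observation covers the three hours starting at its timestamp.

```r
# Hypothetical stand-in for a padded hourly table: observed every 3 h, NA between.
hourly <- data.frame(
  DateTime = as.POSIXct("1991-01-01 03:00:00", tz = "UTC") + 3600 * (0:6),
  Kobb = c(3.032776, NA, NA, 3.076996, NA, NA, 3.314209)
)

# TRUE where a 3-hourly observation exists.
observed <- !is.na(hourly$Kobb)

# Carry the last observed value forward: cumsum(observed) gives, for each row,
# the index of the most recent observation (requires the first row to be observed).
carried <- hourly$Kobb[observed][cumsum(observed)]

# Assumption: spread each 3-hourly total evenly over its three hours.
hourly$NewKobb <- carried / 3
hourly
```

The same two lines (observed and carried) can be applied to the Kobb column of tair2; the Year, Month, DoY and Hour columns would need to be recomputed from DateTime rather than carried forward.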

Upvotes: 2
