sebdalgarno

Reputation: 3197

fill sequence of datetimes

I have a data.frame consisting of IDs, DateTimes and Values. For each ID I want to expand the DateTimes so that there is one row for every hour between that ID's min and max. The ID and Value columns should be filled with their respective values (i.e. not NA). There will never be a case where a single ID has more than one distinct Value. For example, the new data.frame should have 4 rows with ID == 1 and 4 rows with ID == 2 (8 rows total). What is the simplest way to do this?

I commonly work with dplyr, so I will accept either base R or dplyr/tidyverse methods.

library(dplyr)
library(lubridate)

dt <- Sys.time() %>% floor_date('hours')
df <- data.frame(ID = c(1, 1, 2, 2), 
                 DateTime = c(dt, dt + hours(3), dt + hours(6), dt + hours(9)),
                 Value = c(3, 3, 4, 4))

The expected output is:

output <- data.frame(ID = c(1, 1, 1, 1, 2, 2, 2, 2), 
                     DateTime = c(dt, dt + hours(1), dt + hours(2), dt + hours(3), 
                                  dt + hours(6), dt + hours(7), dt + hours(8), dt + hours(9)),
                     Value = c(3, 3, 3, 3, 4, 4, 4, 4))

  ID            DateTime Value
1  1 2018-03-27 19:00:00     3
2  1 2018-03-27 20:00:00     3
3  1 2018-03-27 21:00:00     3
4  1 2018-03-27 22:00:00     3
5  2 2018-03-28 01:00:00     4
6  2 2018-03-28 02:00:00     4
7  2 2018-03-28 03:00:00     4
8  2 2018-03-28 04:00:00     4

Upvotes: 5

Views: 851

Answers (1)

akrun

Reputation: 886948

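We can use complete() from tidyr. Grouping by both ID and Value carries Value onto the new rows, and complete() then fills in DateTime with an hourly sequence from each group's minimum to its maximum: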

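library(tidyverse)
df %>% 
   group_by(ID, Value) %>%   # grouping by Value as well keeps it filled on new rows
   complete(DateTime = seq(min(DateTime), max(DateTime), by = "1 hour")) %>%
   ungroup() %>%             # drop the grouping before returning the result
   select(all_of(names(df))) # restore the original column order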
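
Since base R methods were also requested, here is a minimal base R sketch of the same idea. It assumes, as stated in the question, that each ID has a single Value. The fixed start time and the names pieces and res are only illustrative choices, made so the result is reproducible.

```r
# Example data with a fixed start time (an assumption, for reproducibility)
dt <- as.POSIXct("2018-03-27 19:00:00", tz = "UTC")
df <- data.frame(ID = c(1, 1, 2, 2),
                 DateTime = dt + 3600 * c(0, 3, 6, 9),
                 Value = c(3, 3, 4, 4))

# Split by ID, build the hourly sequence for each piece, then stack the pieces.
# ID and Value are taken from the first row of each piece, which relies on
# every ID having exactly one Value.
pieces <- lapply(split(df, df$ID), function(d) {
  data.frame(ID = d$ID[1],
             DateTime = seq(min(d$DateTime), max(d$DateTime), by = "hour"),
             Value = d$Value[1])
})
res <- do.call(rbind, pieces)
rownames(res) <- NULL  # drop the "1.1", "1.2", ... row names added by rbind
```

With the example data, res has 8 rows, four for each ID.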

Upvotes: 5
