Reputation: 3197
I have a data.frame consisting of IDs, DateTimes and Values. For each ID I want to expand the DateTimes so that there is one row for every hour between the min and max. The ID and Value columns should be filled with their respective values (i.e. not NA). There will never be a case where an ID has more than one unique Value. For example, the new data.frame should have 4 rows with ID == 1 and 4 rows with ID == 2 (8 rows total). What is the simplest way to do this?
I commonly work with dplyr, so I will accept either base or dplyr/tidyverse methods.
library(dplyr)
library(lubridate)
dt <- Sys.time() %>% floor_date('hours')
df <- data.frame(ID = c(1, 1, 2, 2),
                 DateTime = c(dt, dt + hours(3), dt + hours(6), dt + hours(9)),
                 Value = c(3, 3, 4, 4))
The expected output is:
output <- data.frame(ID = c(1, 1, 1, 1, 2, 2, 2, 2),
                     DateTime = c(dt, dt + hours(1), dt + hours(2), dt + hours(3),
                                  dt + hours(6), dt + hours(7), dt + hours(8), dt + hours(9)),
                     Value = c(3, 3, 3, 3, 4, 4, 4, 4))
  ID            DateTime Value
1  1 2018-03-27 19:00:00     3
2  1 2018-03-27 20:00:00     3
3  1 2018-03-27 21:00:00     3
4  1 2018-03-27 22:00:00     3
5  2 2018-03-28 01:00:00     4
6  2 2018-03-28 02:00:00     4
7  2 2018-03-28 03:00:00     4
8  2 2018-03-28 04:00:00     4
Upvotes: 5
Views: 851
Reputation: 886948
We can use complete() from tidyr. Grouping by both ID and Value keeps those columns filled in (not NA) on the newly added rows:
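library(tidyverse)
df %>%
  group_by(ID, Value) %>%
  # add one row per hour between each group's first and last DateTime
  complete(DateTime = seq(min(DateTime), max(DateTime), by = "1 hour")) %>%
  ungroup() %>%
  # restore the original column order
  select(all_of(names(df)))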
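Since the question also allows base R, here is a minimal sketch of the same idea without tidyr. It relies on the question's guarantee that each ID has a single Value. The fixed start time and the helper name expand_hourly are assumptions made up for illustration, not part of the original data.

```r
# Sample data with a fixed start time (an assumption, so the result is reproducible)
dt <- as.POSIXct("2018-03-27 19:00:00", tz = "UTC")
df <- data.frame(ID = c(1, 1, 2, 2),
                 DateTime = dt + 3600 * c(0, 3, 6, 9),
                 Value = c(3, 3, 4, 4))

# Hypothetical helper: build the hourly rows for the rows of one ID.
# Assumes a single Value per ID, so the first one is reused for every row.
expand_hourly <- function(d) {
  data.frame(ID = d$ID[1],
             DateTime = seq(min(d$DateTime), max(d$DateTime), by = "hour"),
             Value = d$Value[1])
}

# Split by ID, expand each piece, and stack the pieces back together
expanded <- do.call(rbind, lapply(split(df, df$ID), expand_hourly))
rownames(expanded) <- NULL  # drop the "1.1", "1.2", ... row names left by rbind
```

This is more verbose than the complete() version, but it needs no packages.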
Upvotes: 5