Reputation: 81
Essentially I'm looking to upsample to fill in missing hours between forecast times.
I have a dataset that looks like this:
case Regions forecastTime WindSpeed_low
1 1 EAST COAST-CAPE ST FRANCIS AND SOUTH 2010-01-01 09:00:00 35
2 1 EAST COAST-CAPE ST FRANCIS AND SOUTH 2010-01-01 12:00:00 25
3 1 EAST COAST-CAPE ST FRANCIS AND SOUTH 2010-01-03 03:00:00 25
4 27 EAST COAST-CAPE ST FRANCIS AND SOUTH 2010-01-05 09:00:00 15
5 27 EAST COAST-CAPE ST FRANCIS AND SOUTH 2010-01-05 16:00:00 00
WindSpeed_high poly_id
1 45 fea1-289
2 NA fea1-289
3 NA fea1-289
4 20 fea1-289
5 NA fea1-289
Each issued forecast has a case number, an associated region and forecast time.
My goal is to expand the forecast times for each case to include all hours between the times the forecast changed:
case Regions forecastTime WindSpeed_low
1 1 EAST COAST-CAPE ST FRANCIS AND SOUTH 2010-01-01 09:00:00 35
2 1 EAST COAST-CAPE ST FRANCIS AND SOUTH 2010-01-01 10:00:00 35
3 1 EAST COAST-CAPE ST FRANCIS AND SOUTH 2010-01-01 11:00:00 35
4 1 EAST COAST-CAPE ST FRANCIS AND SOUTH 2010-01-01 12:00:00 25
5 1 EAST COAST-CAPE ST FRANCIS AND SOUTH 2010-01-01 13:00:00 25
WindSpeed_high poly_id
1 45 fea1-289
2 45 fea1-289
3 45 fea1-289
4 NA fea1-289
5 NA fea1-289
Here the forecast stays the same from 2010-01-01 09:00:00 through 2010-01-01 11:59:59 (fd$WindSpeed_low == 35 and fd$WindSpeed_high == 45). At 2010-01-01 12:00:00 it changes to fd$WindSpeed_low == 25 and fd$WindSpeed_high == NA. I was thinking I could group the forecasts by case, but I am stuck on how to do the expansion correctly. I am relatively new to R.
Upvotes: 0
Views: 124
Reputation: 388807
You may use complete() and fill() from tidyr:
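library(dplyr)
library(tidyr)
# forecastTime must be a date-time (POSIXct) column for seq(..., by = 'hour') to work
df %>%
  group_by(case, Regions) %>%
  # add one row per hour between the first and last forecast of each case
  complete(forecastTime = seq(min(forecastTime), max(forecastTime), by = 'hour')) %>%
  # carry the last observed values forward into the added rows
  # (WindSpeed_high is not listed here, so it stays NA in the added rows)
  fill(WindSpeed_low, poly_id) %>%
  ungroup()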
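One thing to watch: adding WindSpeed_high to fill() would also carry 45 into the 12:00 forecast, where the question expects NA. A possible workaround, sketched here on toy data modelled on the question, is to tag each original forecast row, fill only that tag, and then copy each forecast's own values across its block of hours. The shortened region name and the helper column issue are my own assumptions, not part of the original data.

```r
library(dplyr)
library(tidyr)

# Toy data modelled on the question (region name shortened for readability)
df <- tibble(
  case = c(1, 1, 1, 27, 27),
  Regions = "EAST COAST",
  forecastTime = as.POSIXct(
    c("2010-01-01 09:00:00", "2010-01-01 12:00:00", "2010-01-03 03:00:00",
      "2010-01-05 09:00:00", "2010-01-05 16:00:00"),
    tz = "UTC"
  ),
  WindSpeed_low = c(35, 25, 25, 15, 0),
  WindSpeed_high = c(45, NA, NA, 20, NA),
  poly_id = "fea1-289"
)

expanded <- df %>%
  group_by(case, Regions) %>%
  arrange(forecastTime, .by_group = TRUE) %>%
  # 'issue' (hypothetical helper) numbers the original forecasts within a case
  mutate(issue = row_number()) %>%
  # add one row per hour between the first and last forecast of each case
  complete(forecastTime = seq(min(forecastTime), max(forecastTime), by = "hour")) %>%
  # only the tag is filled down, so added rows know which forecast they belong to
  fill(issue) %>%
  group_by(case, Regions, issue) %>%
  # copy the values of the original row (first in each block), NAs included
  mutate(across(c(WindSpeed_low, WindSpeed_high, poly_id), first)) %>%
  ungroup() %>%
  select(-issue)
```

With this approach an NA that was actually issued in a forecast stays NA for all of that forecast's hours, instead of being overwritten by the previous forecast's value.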
Upvotes: 1