Reputation: 569
I have a time series of daily rainfall data from 1843-2016 but some of the days in the record are missing completely. I wish to infill dates for the days with missing data and give this a missing N/A code in the rainfall column. Is this possible?
My data takes the form:
Year Month Day Rainfall (mm)
1843 1 1 4.3
1843 1 2 0.0
1843 1 3 1.1
1843 1 5 0.0
Upvotes: 1
Views: 126
Reputation: 886948
We can try with dplyr/tidyr. Create a sequence of dates from the first day of 1843 to the last day of 2016, convert it to a data.frame, separate it into 'Year', 'Month', and 'Day', then left_join it with the original dataset ('df1') so that the missing combinations have NA in the "Rainfall" column.
library(tidyr)
library(dplyr)
# every calendar day in the record period
Dates <- seq(as.Date("1843-01-01"), as.Date("2016-12-31"), by = "1 day")
# tibble() replaces the deprecated data_frame()
tibble(Dates) %>%
  separate(Dates, into = c("Year", "Month", "Day"), convert = TRUE) %>%
  left_join(df1, by = c("Year", "Month", "Day"))
Using a small reproducible example:
df1 <- data.frame(Year = 1843, Month = 1, Day = c(1, 5, 7, 10), Rainfall = c(4.3, 0, 1.1, 0))
Dates <- seq(as.Date("1843-01-01"), as.Date("1843-01-10"), by = "1 day")
# tibble() replaces the deprecated data_frame()
tibble(Dates) %>%
  separate(Dates, into = c("Year", "Month", "Day"), convert = TRUE) %>%
  left_join(df1, by = c("Year", "Month", "Day"))
#    Year Month   Day Rainfall
#   <dbl> <dbl> <dbl>    <dbl>
#1   1843     1     1      4.3
#2   1843     1     2       NA
#3   1843     1     3       NA
#4   1843     1     4       NA
#5   1843     1     5      0.0
#6   1843     1     6       NA
#7   1843     1     7      1.1
#8   1843     1     8       NA
#9   1843     1     9       NA
#10  1843     1    10      0.0
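If installing packages is not an option, the same idea can be sketched in base R with merge(). This is only an illustrative alternative, not part of the approach above: the names 'all_dates', 'calendar' and 'filled' are made up for the example, and the small 'df1' is the same toy data as before. It builds the full calendar, splits it into Year/Month/Day, and keeps every calendar row with all.x = TRUE so that days absent from 'df1' get NA.

```r
# Toy data in the same layout as the question
df1 <- data.frame(Year = 1843, Month = 1, Day = c(1, 5, 7, 10),
                  Rainfall = c(4.3, 0, 1.1, 0))

# Every calendar day in the period of interest
all_dates <- seq(as.Date("1843-01-01"), as.Date("1843-01-10"), by = "1 day")

# Split each date into numeric Year/Month/Day columns
calendar <- data.frame(
  Date  = all_dates,
  Year  = as.numeric(format(all_dates, "%Y")),
  Month = as.numeric(format(all_dates, "%m")),
  Day   = as.numeric(format(all_dates, "%d"))
)

# all.x = TRUE keeps every calendar row; unmatched days get NA in Rainfall
filled <- merge(calendar, df1, by = c("Year", "Month", "Day"), all.x = TRUE)

# merge() sorts by the key columns; reorder by Date to be explicit
filled <- filled[order(filled$Date), ]

# Number of days that were infilled with NA
n_missing <- sum(is.na(filled$Rainfall))
```

For the full record, change the end points of 'all_dates' to "1843-01-01" and "2016-12-31". Keeping the 'Date' column around can also be handy later for plotting or aggregating by month.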
Upvotes: 3