Reputation: 789
I am struggling to fill some NAs in an hourly temperature vector.
Out of 21885 observations, 472 are NA, distributed randomly. The NAs should be filled in a way that respects the shape of the temperature curve throughout the day.
They occur in groups: some are isolated, others come in runs of 2, 3, 4 or more NAs in a row. If the group is small I would take the previous or the following value, but when the group is large this won't work.
I think an interpolation between the last known value and the next known one is ideal, but I have no clue how to do this as I am fairly new to R.
Thank you in advance for your time. Any advice on which function or approach to use for this problem will be very much appreciated.
Sample:
mydate <- c("2017-03-23 09:00:00 CET","2017-03-23 10:00:00 CET", "2017-03-23 11:00:00 CET" ,"2017-03-23 12:00:00 CET" ,"2017-03-23 13:00:00 CET" ,"2017-03-23 14:00:00 CET" ,"2017-03-23 15:00:00 CET", "2017-03-23 16:00:00 CET",
"2017-03-23 17:00:00 CET", "2017-03-23 18:00:00 CET", "2017-03-23 19:00:00 CET" ,"2017-03-23 20:00:00 CET" ,"2017-03-23 21:00:00 CET" ,"2017-03-23 22:00:00 CET", "2017-03-23 23:00:00 CET" ,"2017-03-24 00:00:00 CET",
"2017-03-24 01:00:00 CET", "2017-03-24 02:00:00 CET" ,"2017-03-24 03:00:00 CET" ,"2017-03-24 04:00:00 CET")
mytemp <- c(12, 13, 13, 15, 16, 15, NA, NA, NA, NA ,NA, NA, NA, NA, NA, NA, 10, 10, 9, 9)
# data.frame() keeps mytemp numeric; cbind() would coerce both columns to character
mydataframe <- data.frame(mydate, mytemp)
CSV with all instances: https://wetransfer.com/downloads/a1806d8b04013e3ea4acee9bff746b1d20170803073703/8e6e4c
Upvotes: 0
Views: 418
Reputation: 1169
This function from the zoo package seems to do the job:
zoo::na.fill(mytemp, fill = "extend")
[1] 12.00000 13.00000 13.00000 15.00000 16.00000 15.00000 14.54545
[8] 14.09091 13.63636 13.18182 12.72727 12.27273 11.81818 11.36364
[15] 10.90909 10.45455 10.00000 10.00000 9.00000 9.00000
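For reference, here is a minimal base R sketch of the same linear interpolation, in case adding a package is not an option. It assumes the observations are equally spaced (hourly), so the vector positions can stand in for time; the names idx, known and filled are only illustrative.

```r
# Sample vector from the question
mytemp <- c(12, 13, 13, 15, 16, 15, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 10, 10, 9, 9)

idx   <- seq_along(mytemp)   # positions 1..20 stand in for the hourly time steps
known <- !is.na(mytemp)      # TRUE where a temperature was actually recorded

# Linear interpolation between the known points; rule = 2 repeats the nearest
# known value if the series happens to start or end with NA
filled <- approx(x = idx[known], y = mytemp[known], xout = idx, rule = 2)$y

filled
```

Each gap is filled along a straight line between the last known value before it and the first known value after it, which is what the question describes.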
Edit: this question and its answer deal with a more general situation where the time points aren't equidistant, using zoo::na.approx. One difference is that na.approx does not, by default, extend to the leading and trailing NAs, while na.fill does (when fill = "extend").
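A rough sketch of the na.approx route, in case it helps. It assumes the timestamps can be rebuilt as a regular hourly sequence, and it uses UTC rather than CET purely to avoid daylight-saving complications. The object names (times, z, filled_all, filled_short) and the 3-hour threshold are illustrative choices, not something taken from the question.

```r
library(zoo)

# Sample vector from the question
mytemp <- c(12, 13, 13, 15, 16, 15, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 10, 10, 9, 9)

# Hourly timestamps starting at the first observation (time zone is an assumption)
times <- seq(as.POSIXct("2017-03-23 09:00:00", tz = "UTC"),
             by = "hour", length.out = length(mytemp))
z <- zoo(mytemp, order.by = times)

# Interpolate against the time index; rule = 2 is passed on to approx()
# so that leading/trailing NAs take the nearest known value
filled_all <- na.approx(z, rule = 2)

# Only fill runs of at most 3 consecutive NAs; longer gaps stay NA
filled_short <- na.approx(z, maxgap = 3, na.rm = FALSE)
```

The maxgap variant may be useful for the long runs mentioned in the question, where a straight line across many hours would flatten the daily temperature cycle; those gaps are left as NA so they can be handled separately.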
Upvotes: 1