Reputation: 1451
My dataset looks like the following (let's call it "a"):
date value
2013-01-01 12.2
2013-01-02 NA
2013-01-03 NA
2013-01-04 16.8
2013-01-05 10.1
2013-01-06 NA
2013-01-07 12.0
I would like to replace each NA with the mean of the closest surrounding values (the previous and the next values in the series).
I tried the following but I am not convinced by the output...
miss.val = which(is.na(a$value))
library(zoo)
z = zoo(a$value, a$date)
z.corr = na.approx(z)
z.corr[(miss.val - 1):(miss.val + 1), ]
Upvotes: 6
Views: 3786
Reputation: 7730
You can do this in one line of code with the moving-average function na_ma (called na.ma in older releases) from the imputeTS package:
library(imputeTS)
na_ma(yourData, k = 1)
This replaces each missing value with the mean of the values surrounding it, using a window of k observations on each side. You can also set further parameters:
na_ma(yourData, k = 2, weighting = "simple")
In this case the algorithm takes the next 2 values in each direction. You can also choose a different weighting of the values (for example, if you want closer values to have more influence).
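As a rough sketch, this is how the call might look on the data from the question. The data frame `a` is rebuilt here by hand, and `filled` is just an illustrative name. One caveat, as far as I understand the na_ma documentation: when the window holds fewer than two non-missing values it is widened automatically, so for runs of consecutive NA the result is not necessarily the plain mean of the two nearest observed values.

```r
library(imputeTS)

# Rebuild the example data from the question (assumed layout: date + value)
a <- data.frame(
  date  = as.Date("2013-01-01") + 0:6,
  value = c(12.2, NA, NA, 16.8, 10.1, NA, 12.0)
)

# k = 1: one observation on each side; "simple" gives every value
# in the window the same weight (the default weighting is "exponential")
filled <- na_ma(a$value, k = 1, weighting = "simple")

# Compare the original and the imputed series side by side
data.frame(date = a$date, original = a$value, imputed = filled)
```

An isolated gap such as 2013-01-06 has exactly one observed neighbour on each side, so with these settings it should get the mean of those two values.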
Upvotes: 2
Reputation: 68839
Using na.locf (Last Observation Carried Forward) from package zoo:
R> library("zoo")
R> x <- c(12.2, NA, NA, 16.8, 10.1, NA, 12.0)
R> (na.locf(x) + rev(na.locf(rev(x))))/2
[1] 12.20 14.50 14.50 16.80 10.10 11.05 12.00
(does not work if the first or last element of x is NA)
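If leading or trailing NA values are possible, one way around that limitation is to keep the vector length fixed with `na.rm = FALSE` and fill backwards with `fromLast = TRUE` instead of reversing twice. The following is only a sketch: `fill_neighbour_mean` is a made-up helper name, not a zoo function, and letting an edge NA take its single available neighbour is an assumption about what you would want there.

```r
library(zoo)

# Hypothetical helper (not part of zoo): mean of the nearest non-missing
# value on each side; at the edges only one side exists, so that value is used.
fill_neighbour_mean <- function(x) {
  prev_val <- na.locf(x, na.rm = FALSE)                   # last observation carried forward
  next_val <- na.locf(x, na.rm = FALSE, fromLast = TRUE)  # next observation carried backward
  rowMeans(cbind(prev_val, next_val), na.rm = TRUE)       # ignore the missing side at the edges
}

# Same series as above, with an extra NA at each end
x <- c(NA, 12.2, NA, NA, 16.8, 10.1, NA, 12.0, NA)
res <- fill_neighbour_mean(x)
res
```

Observed values are left unchanged, because both fills return the value itself at those positions.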
Upvotes: 6