Reputation: 320
I am trying to find the average of the date of the row above and of the row below. However, lag(Date) and lead(Date) are producing vectors which mean() then refuses to work with returning NA instead (figuring that out took me faaaaaar too long).
df <- data_frame(Date = as.Date(c("2020-01-01", NA, "2020-12-12")))
df <- mutate(df, Date = replace_na(Date, mean(c(lag(Date),lead(Date)))))
This gives ("2020-01-01", NA, "2020-12-12") whereas what I want is ("2020-01-01, "2020-06-06", "2020-12-12")
So how do I access the previous and next row value for Date so that I can generate an average?
Upvotes: 1
Views: 622
Reputation: 269674
data_frame
is deprecated so we have replaced that with data.frame
. tibble
would be an alternative but involves an additional dependency. Use na.approx
in zoo and convert it back to Date
class as it produces a numeric output. This also works to interpolate multiple NAs in a row. That would not work if we used lead
and lag
. If there are NAs at the ends this leaves them as is or we could use different arguments to na.approx
to fill those in too; however, in the example data there are no such NAs so we leave it as shown.
(Note that the correct halfway point is as shown below rather than as shown in the question. There are 173 days between 2020-01-01 and 2020-06-22 and also between 2020-06-22 and 2020-12-12.)
library(dplyr)
library(zoo)
df <- data.frame(Date = as.Date(c("2020-01-01", NA, "2020-12-12"))) # test input
mutate(df, Date = as.Date(na.approx(Date, na.rm = FALSE)))
giving:
Date
1 2020-01-01
2 2020-06-22
3 2020-12-12
Upvotes: 2
Reputation: 5788
Base R one liner solving your sample data (suspecting you have multiple instances of NA -- see solution below):
df$Date <- ifelse(is.na(df$Date), mean(df$Date, na.rm = TRUE), df$Date)
Interpolating dates:
df$Date <- as.Date(ifelse(
is.na(df$Date),
approx(as.numeric(df$Date), method = "linear", n = nrow(df))$y[which(is.na(df$Date))]
,
df$Date
),
origin = as.Date("1970-01-01", "%Y-%m-%d"),
"%d-%m-%Y")
Data used:
df <- data.frame(Date = as.Date(c("2020-01-01", NA, "2020-12-12")))
Upvotes: 1