Reputation: 333
I have a data frame that contains several scattered NA values. I would like to fill each NA with the value in the cell immediately to its left (same row), or, if there is no value to the left or that value is also NA, with the value in the next cell to the right (same row). It seems like zoo::na.locf or tidyr::fill() can help with this, but they only seem to work by taking the previous/next value above or below in the same column.
I currently have this code, but it only fills from neighbouring values in the same column:
lapply(df, function(x) zoo::na.locf(zoo::na.locf(x, na.rm = FALSE), fromLast = TRUE))
My dataframe df looks like this:
  C1 C2 C3 C4
1  2  1  9  2
2 NA  5  1  1
3  1 NA  3  8
4  3 NA NA  4
structure(list(C1 = c(2, NA, 1, 3), C2 = c(1, 5, NA, NA), C3 = c(9,
1, 3, NA), C4 = c(2, 1, 8, 4)), row.names = c(NA, 4L), class = "data.frame")
After filling the NA values, I would like it to look like this:
  C1 C2 C3 C4
1  2  1  9  2
2  5  5  1  1
3  1  1  3  8
4  3  3  3  4
Upvotes: 1
Views: 687
Reputation: 887088
With apply and na.locf0 (from zoo):
library(zoo)
df[] <- t(apply(df, 1, function(x) na.locf0(na.locf0(x), fromLast = TRUE)))
-output
df
#   C1 C2 C3 C4
# 1  2  1  9  2
# 2  5  5  1  1
# 3  1  1  3  8
# 4  3  3  3  4
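To see why two passes are needed, here is a minimal sketch that applies the same two na.locf0 calls to a single row (row 2 of the example data). The variable names row2, forward and filled are only illustrative and do not appear in the answer above.

```r
library(zoo)

# Row 2 of the example data frame, which starts with an NA
row2 <- c(NA, 5, 1, 1)

# Pass 1: carry the last observation forward (left to right).
# The leading NA has nothing to its left, so it stays NA.
forward <- na.locf0(row2)

# Pass 2: carry the next observation backward (right to left)
# to fill whatever the first pass could not reach.
filled <- na.locf0(forward, fromLast = TRUE)

print(forward)
print(filled)
```

The first pass leaves the leading NA in place, and the second pass replaces it with the 5 to its right. Wrapping both calls in apply(df, 1, ...) repeats this for every row.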
Upvotes: 2
Reputation: 388972
na.locf can work directly on data frames, but it works column-wise. To make it run row-wise, you can transpose the data frame. You can also use fromLast = TRUE to fill the data from the opposite direction. Finally, we use coalesce to select the first non-NA value from the two vectors.
library(zoo)
df[] <- dplyr::coalesce(c(t(na.locf(t(df), na.rm = FALSE))),
c(t(na.locf(t(df), na.rm = FALSE, fromLast = TRUE))))
df
#   C1 C2 C3 C4
# 1  2  1  9  2
# 2  5  5  1  1
# 3  1  1  3  8
# 4  3  3  3  4
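The one-liner can be easier to follow when the two fills are kept as separate matrices. The sketch below is an illustration under a few assumptions: the names left_fill, right_fill and filled are made up for this example, and base R's ifelse() stands in for dplyr::coalesce so that the matrix shape is kept without flattening to a vector.

```r
library(zoo)

df <- data.frame(C1 = c(2, NA, 1, 3), C2 = c(1, 5, NA, NA),
                 C3 = c(9, 1, 3, NA), C4 = c(2, 1, 8, 4))

# Transposing turns each original row into a column, so the column-wise
# na.locf() effectively fills along the original rows.
left_fill  <- t(na.locf(t(df), na.rm = FALSE))                   # value from the left
right_fill <- t(na.locf(t(df), na.rm = FALSE, fromLast = TRUE))  # value from the right

# Prefer the left neighbour; fall back to the right one where it is still NA.
# (ifelse() is used here in place of dplyr::coalesce.)
filled <- ifelse(is.na(left_fill), right_fill, left_fill)

# Write the values back while keeping df a data frame
df[] <- filled
print(df)
```

In left_fill, the first cell of row 2 is still NA because nothing precedes it; right_fill supplies the 5 for that cell, which is the fallback the question asks for.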
Upvotes: 1
Reputation: 3134
This is indeed not the usual way to store data, but if you just transpose you can use tidyr::fill(). The only downside is that it adds quite a bit of wrapping code.
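library(tidyverse)  # provides %>%, as_tibble(), fill(), everything() and set_names()

xx <- structure(list(C1 = c(2, NA, 1, 3), C2 = c(1, 5, NA, NA), C3 = c(9,
1, 3, NA), C4 = c(2, 1, 8, 4)), row.names = c(NA, 4L), class = "data.frame")
xx %>%
  t() %>%                # original rows become columns
  as_tibble() %>%
  tidyr::fill(everything(), .direction = "downup") %>%  # fill down first, then up
  t() %>%                # transpose back to the original orientation
  as_tibble() %>%
  set_names(names(xx))   # restore the original column names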
# A tibble: 4 x 4
# C1 C2 C3 C4
# <dbl> <dbl> <dbl> <dbl>
#1 2 1 9 2
#2 5 5 1 1
#3 1 1 3 8
#4 3 3 3 4
Upvotes: 3