user111024

Reputation: 811

If a column is NA, calculate row mean on other columns using dplyr

In the example below, how can I calculate the row mean of the other columns when column A is NA? The row mean should replace the NA in column A. In base R, I can do this:

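library(tibble)

foo <- tibble(A = c(3,5,NA,6,NA,7,NA),
              B = c(4,5,4,5,6,4,NA),
              C = c(6,5,2,8,8,5,NA))
foo

# Row means of every column except A, ignoring NAs
tmp <- rowMeans(foo[,-1],na.rm = TRUE)
# Fill the missing values in A with the matching row means
foo$A[is.na(foo$A)] <- tmp[is.na(foo$A)]
# Rows where B and C are both NA yield NaN; turn those back into NA
foo$A[is.nan(foo$A)] <- NA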

How might I do this with dplyr?

Upvotes: 2

Views: 171

Answers (3)

mt1022

Reputation: 17289

Use if_else:

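library(dplyr)

foo %>%
    # Row mean over all columns; NAs (including a missing A) are dropped
    mutate(m = rowMeans(across(everything()), na.rm = TRUE),
        # Fill A only where it is missing and the row mean is defined
        A = if_else(is.na(A) & !is.na(m), m, A)) %>%
    select(-m)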

# # A tibble: 7 x 3
#       A     B     C
#   <dbl> <dbl> <dbl>
# 1     3     4     6
# 2     5     5     5
# 3     3     4     2
# 4     6     5     8
# 5     7     6     8
# 6     7     4     5
# 7    NA    NA    NA

Upvotes: 2

www

Reputation: 39154

Here is a solution that replaces NA not only in column A but in every column of the data frame.

library(dplyr)

foo2 <- foo %>%
  mutate(RowMean = rowMeans(., na.rm = TRUE)) %>%
  mutate(across(-RowMean, .fns = 
                  function(x) ifelse(is.na(x) & !is.nan(RowMean), RowMean, x))) %>%
  select(-RowMean)
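
As a quick check of the all-columns behavior, here is a minimal sketch on a small hypothetical tibble, bar, which is not part of the original question. It reuses the same pipeline, and each NA should be filled with the mean of the remaining values in its row, while a row that is entirely NA stays NA.

```r
library(dplyr)

# Hypothetical data: NAs appear in different columns, and row 3 is all NA
bar <- tibble(A = c(1, NA, NA),
              B = c(NA, 4, NA),
              C = c(3, 6, NA))

bar2 <- bar %>%
  # Row mean over all columns, ignoring NAs (NaN when the whole row is NA)
  mutate(RowMean = rowMeans(., na.rm = TRUE)) %>%
  # Fill any NA with its row mean, unless that mean is undefined
  mutate(across(-RowMean, .fns =
                  function(x) ifelse(is.na(x) & !is.nan(RowMean), RowMean, x))) %>%
  select(-RowMean)

bar2
# Expected: A = 1, 5, NA; B = 2, 4, NA; C = 3, 6, NA
```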

Upvotes: 2

Ronak Shah

Reputation: 388982

You can use ifelse:

library(dplyr)

foo %>% 
  mutate(A = ifelse(is.na(A), rowMeans(., na.rm = TRUE), A), 
         A = replace(A, is.nan(A), NA))

#      A     B     C
#  <dbl> <dbl> <dbl>
#1     3     4     6
#2     5     5     5
#3     3     4     2
#4     6     5     8
#5     7     6     8
#6     7     4     5
#7    NA    NA    NA
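
The dot in rowMeans(., ...) is the magrittr placeholder, so the code above depends on the %>% pipe. A possible variant that avoids the placeholder and also works with the native |> pipe is sketched below. It assumes dplyr 1.1.0 or later, where pick() is available, and it names the other columns (B and C) explicitly.

```r
library(dplyr)

foo <- tibble(A = c(3, 5, NA, 6, NA, 7, NA),
              B = c(4, 5, 4, 5, 6, 4, NA),
              C = c(6, 5, 2, 8, 8, 5, NA))

res <- foo |>
  # Row mean of the other columns only; NaN when both B and C are NA
  mutate(m = rowMeans(pick(B, C), na.rm = TRUE),
         # Fill A only where it is missing and the row mean is defined
         A = if_else(is.na(A) & !is.nan(m), m, A)) |>
  select(-m)

res
```

Because the mean is taken over B and C only, the existing values of A never enter the calculation.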

Upvotes: 2
