Kuni
Kuni

Reputation: 865

How to find if there are NA in each row in multiple columns?

I have this dataframe in R

df <- structure(list(S.No = c(1L, 2L, 3L, 8L, 5L, 6L), A = c(8L, 8L, 
5L, 2L, NA, 3L), B = c(5L, 8L, 8L, 2L, NA, 3L), C = c("8", "test", 
"error", "3", NA, "3"), D = c(5L, 5L, 3L, 3L, NA, 8L), E = c("test", 
"8", "8", "error", NA, "3")), class = "data.frame", row.names = c(NA, 
-6L))

I need to find out if all values of the columns for each row have NA. It needs to be rowwise, but I can't get this to work. This is what I have tried so far

test.vars = c("A","B","C","D","E")

df %>% mutate(null_message = as.numeric(is.na(rowSums(.[test.vars]))))

This works if my columns have only numeric values. So, I tried to do something else:

df %>% mutate(null_message = any(is.na((.[test.vars]))))

but this doesn't work. It shows all rows as TRUE, and I know why. I can use | but it seems bit tedious to do is.na(A) | is.na(B) | is.na(C) | ... . Is there a way I can get this done efficiently?

The expected output is the following:

enter image description here

Upvotes: 3

Views: 1342

Answers (2)

Lennyy
Lennyy

Reputation: 6132

You could do:

df %>% 
  rowwise %>% 
  mutate(null_message = as.integer(all(c(A,B,C,D,E) %in% NA)))

# A tibble: 6 x 7
# Rowwise: 
   S.No     A     B C         D E         null_message
  <int> <int> <int> <chr> <int> <chr> <int>
1     1     8     5 8         5 test      0
2     2     8     8 test      5 8         0
3     3     5     8 error     3 8         0
4     8     2     2 3         3 error     0
5     5    NA    NA NA       NA NA        1
6     6     3     3 3         8 3         0

Upvotes: 2

Ronak Shah
Ronak Shah

Reputation: 389355

You are on the right path :

library(dplyr)
test.vars = c("A","B","C","D","E")

df %>% mutate(null_message = as.numeric(rowSums(is.na(.[test.vars])) == length(test.vars)))

#  S.No  A  B     C  D     E null_message
#1    1  8  5     8  5  test            0
#2    2  8  8  test  5     8            0
#3    3  5  8 error  3     8            0
#4    8  2  2     3  3 error            0
#5    5 NA NA  <NA> NA  <NA>            1
#6    6  3  3     3  8     3            0

This reads assign 1 if number of NA values in the row is same as length(test.vars).

Or in other way :

df %>% mutate(null_message = as.numeric(rowSums(!is.na(.[test.vars])) == 0))

This reads assign 1 if number of non-NA value in the row is 0.

Upvotes: 3

Related Questions