Faiz Lotfy

Reputation: 121

Select only rows if the value in a particular set of columns is 'NA' in R

I have a data frame with many rows and columns (3000 x 37), and I want to select only the rows in which two or more columns are NA. The columns hold different data types. I know how to do this when checking a single column:

df[is.na(df$col.name), ]

How can I make this selection when the check has to cover two (or more) columns?

Upvotes: 0

Views: 122

Answers (1)

pcantalupo

Reputation: 2226

First create a vector nn holding the number of NAs in each row, then select only the rows with two or more NAs using d[nn>=2,]:

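d = data.frame(x=c(NA,1,2,3), y=c(NA,"a",NA,"c"))
# Count the NAs in each row (MARGIN = 1 applies the function row by row)
nn = apply(d, 1, FUN=function (x) {sum(is.na(x))})
# Keep only the rows with at least two NAs
d[nn>=2,]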

   x    y
1 NA <NA>
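
If the check should cover only a particular set of columns, as the question title suggests, one possible variation is to subset those columns first and count the NAs with rowSums(). This is only a sketch: the extra column z and the vector cols are made up for illustration and are not part of the original question.

```r
# Hypothetical example data; the column z is added only for illustration.
d <- data.frame(x = c(NA, 1, 2, 3),
                y = c(NA, "a", NA, "c"),
                z = c(NA, NA, NA, 1))

# Assumed set of columns to check; replace with your own column names.
cols <- c("x", "y")

# is.na() on a data frame returns a logical matrix, and rowSums()
# counts the TRUE values (the NAs) in each row of that matrix.
# drop = FALSE keeps a data frame even if cols has a single name.
nn <- rowSums(is.na(d[, cols, drop = FALSE]))

# Keep the rows with at least two NAs among the chosen columns.
res <- d[nn >= 2, ]
print(res)
```

Because is.na() is applied to the data frame directly, this avoids the conversion of each row to a single type that apply() performs, which may matter when the columns have mixed types.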

Upvotes: 1
