psycho95
psycho95

Reputation: 141

Remove multiple rows if condition for one row is met

I have the following dataset:

ID   value
abc    1
abc    NA
abc    2
def    5
def    1 
def    4

I know how to identify the ID of the row that contains a NA . What I want to do is to delete all rows with the ID if one row contains a NA. In this case: One row of abc shows an NA, thus all rows with ID = abc should be removed, so that the dataframe looks like this:

ID   value
def    5
def    1 
def    4

Upvotes: 1

Views: 519

Answers (4)

ThomasIsCoding
ThomasIsCoding

Reputation: 101247

A data.table option

> setDT(df)[, .SD[!any(is.na(value))], ID]
    ID value
1: def     5
2: def     1
3: def     4

Data

> dput(df)
structure(list(ID = c("abc", "abc", "abc", "def", "def", "def"
), value = c(1L, NA, 2L, 5L, 1L, 4L)), class = "data.frame", row.names = c(NA,
-6L))

Upvotes: 0

akrun
akrun

Reputation: 887068

Using subset with table

 subset(df1, ID %in% names(which(!table(ID, is.na(value))[, 2])))

Upvotes: 0

G. Grothendieck
G. Grothendieck

Reputation: 269501

Using DF shown reproducibly in the Note at the end ave creates a vector which is TRUE for each element that is part of an ID group in which any of the elements are NA and then we subset to all but those.

subset(DF, !ave(value, ID, FUN = anyNA))

giving:

   ID value
4 def     5
5 def     1
6 def     4

Using de Morgan's law this could be equivalently expressed as:

subset(DF, ave(!is.na(value), ID, FUN = all))

Note

Lines <- "ID   value
abc    1
abc    NA
abc    2
def    5
def    1 
def    4"
DF <- read.table(text = Lines, header = TRUE)

Upvotes: 1

GKi
GKi

Reputation: 39657

You can use the negated ! %in% of ID which have an NA to delete all rows with the ID if one row contains a NA.

x[!x$ID %in% unique(x$ID[is.na(x$value)]),]
#   ID value
#4 def     5
#5 def     1
#6 def     4

Upvotes: 3

Related Questions