Nazer

Reputation: 3764

Remove group that has NAs in only some rows

I need to remove years that do not have a measurement for every day of the year. Treat the sample below as the full data set: I want to drop all 2001 rows because 2001 has one missing measurement.

year    day    value  
2000     1       5
2000     2       3  
2000     3       2
2000     4       3  
2001     1       2
2001     2       NA  
2001     3       6  
2001     4       5

Sorry, I don't have any code attempts; I can't wrap my head around it right now, and it took me forever to get this far. I'd prefer something I can pipe into with %>%, since this comes at the end of a long chain.

Upvotes: 1

Views: 62

Answers (3)

moodymudskipper

Reputation: 47310

In base R you could do:

subset(df,!year %in% year[is.na(value)])
#   year day value
# 1 2000   1     8
# 2 2000   2     5
# 3 2000   3     4
# 4 2000   4     1
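
Since the question asks for something that fits at the end of a pipeline, here is a minimal sketch of the same subset() call used as a pipe step. It assumes R 4.1 or later for the native |> pipe (magrittr's %>% should behave the same way here), and it rebuilds the question's sample data under the name df.

```r
# Sample data from the question (2001 has one NA)
df <- data.frame(
  year  = rep(c(2000, 2001), each = 4),
  day   = rep(1:4, times = 2),
  value = c(5, 3, 2, 3, 2, NA, 6, 5)
)

# year[is.na(value)] collects the years that contain an NA;
# every row belonging to one of those years is then dropped
result <- df |> subset(!year %in% year[is.na(value)])
result
```

Because subset() evaluates its condition inside the data frame, no grouping step is needed.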

Upvotes: 1

Shree

Reputation: 11140

Here's a one-line solution using base R's ave() (the %>% pipe itself comes from magrittr or dplyr) -

df %>% .[!ave(.$value, .$year, FUN = anyNA), ]

Example -

df <- data.frame(year = c(rep(2000, 4), rep(2001, 4)), day = 1:4, value = sample.int(10, 8))
df$value[6] <- NA_integer_

#   year day value
# 1 2000   1     4
# 2 2000   2     3
# 3 2000   3     2
# 4 2000   4     7
# 5 2001   1     8
# 6 2001   2    NA
# 7 2001   3     1
# 8 2001   4     5

df %>% .[!ave(.$value, .$year, FUN = anyNA), ]

#   year day value
# 1 2000   1     4
# 2 2000   2     3
# 3 2000   3     2
# 4 2000   4     7
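
To make the one-liner easier to follow, here is a sketch that splits it into steps. It uses the question's sample values rather than the random ones above; the names flag, keep and result are only illustrative.

```r
# Sample data from the question (2001 has one NA)
df <- data.frame(
  year  = rep(c(2000, 2001), each = 4),
  day   = rep(1:4, times = 2),
  value = c(5, 3, 2, 3, 2, NA, 6, 5)
)

# ave() runs anyNA() once per year and repeats the answer on every row of that year.
# The answer is written back into a numeric vector, so it is stored as 0/1.
flag <- ave(df$value, df$year, FUN = anyNA)

# `!` turns 0/1 into TRUE/FALSE: TRUE for rows whose year has no NA
keep <- !flag

result <- df[keep, ]
result
```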

Upvotes: 1

Gregor Thomas

Reputation: 145775

Filtering based on presence of NA values:

library(dplyr)

df %>%
  group_by(year) %>%
  filter(!anyNA(value))  # keep only years with no missing value

Alternative filter conditions (pick what you find most readable):

all(!is.na(value))
sum(is.na(value)) == 0
!any(is.na(value))
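
For reference, a self-contained sketch of the grouped filter run on the question's sample data. It assumes dplyr is installed; the trailing ungroup() is an addition of this sketch, included so that later pipeline steps are not affected by the grouping.

```r
library(dplyr)

# Sample data from the question (2001 has one NA)
df <- data.frame(
  year  = rep(c(2000, 2001), each = 4),
  day   = rep(1:4, times = 2),
  value = c(5, 3, 2, 3, 2, NA, 6, 5)
)

result <- df %>%
  group_by(year) %>%
  filter(!anyNA(value)) %>%  # the condition is evaluated once per year
  ungroup()                  # drop the grouping for downstream steps
result
```

Inside a grouped filter(), a single TRUE or FALSE is recycled across the whole group, which is why one missing value removes every row of that year.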

Upvotes: 5
