Reputation: 3764
I need to remove years that do not have measurements for every day of the year. Pretend this is a full set and I want to get rid of all 2001 rows because 2001 has one missing measurement.
year day value
2000 1 5
2000 2 3
2000 3 2
2000 4 3
2001 1 2
2001 2 NA
2001 3 6
2001 4 5
Sorry, I don't have any code attempts. I can't wrap my head around it right now, and it took me forever to get this far. I'd prefer something I can pipe into with %>%, as this comes at the end of a long chain.
Upvotes: 1
Views: 62
Reputation: 47310
In base R you could do:
subset(df,!year %in% year[is.na(value)])
# year day value
# 1 2000 1 8
# 2 2000 2 5
# 3 2000 3 4
# 4 2000 4 1
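If the one-liner is hard to read, it can be unpacked into two steps. This is just a sketch: it rebuilds the question's sample data in a data frame called df, and the names bad_years and result are mine, purely for illustration.

```r
# The question's sample data, assumed to live in a data frame called df
df <- data.frame(
  year  = rep(c(2000, 2001), each = 4),
  day   = rep(1:4, times = 2),
  value = c(5, 3, 2, 3, 2, NA, 6, 5)
)

# Step 1: collect the years that contain at least one missing value
bad_years <- unique(df$year[is.na(df$value)])

# Step 2: keep only the rows whose year is not in that set
result <- subset(df, !year %in% bad_years)
result
```

Since subset() takes the data frame as its first argument, it should also slot into a %>% chain as subset(!year %in% year[is.na(value)]).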
Upvotes: 1
Reputation: 11140
Here's a one-line solution using base R subsetting (the %>% pipe itself comes from magrittr) -
df %>% .[!ave(.$value, .$year, FUN = anyNA), ]
Example -
df <- data.frame(year = c(rep(2000, 4), rep(2001, 4)), day = 1:4, value = sample.int(10, 8))
df$value[6] <- NA_integer_
# year day value
# 1 2000 1 4
# 2 2000 2 3
# 3 2000 3 2
# 4 2000 4 7
# 5 2001 1 8
# 6 2001 2 NA
# 7 2001 3 1
# 8 2001 4 5
df %>% .[!ave(.$value, .$year, FUN = anyNA), ]
# year day value
# 1 2000 1 4
# 2 2000 2 3
# 3 2000 3 2
# 4 2000 4 7
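In case the ave() part looks opaque: ave() returns a vector as long as its input, with each group's FUN result repeated across that group's rows. A small sketch of just that step, using standalone vectors named value, year and flag (my names, mirroring the example above):

```r
# Standalone vectors mirroring the example data above
value <- c(4L, 3L, 2L, 7L, 8L, NA, 1L, 5L)
year  <- rep(c(2000, 2001), each = 4)

# anyNA() is evaluated once per year and the result is spread over that year's rows
flag <- ave(value, year, FUN = anyNA)
flag
# 0 for rows in a complete year, 1 for rows in a year that has an NA

# Negating the flag gives the logical row index used in the one-liner
keep <- !flag
keep
```

Because the result is stored back into a numeric vector, the flags come out as 0/1 rather than FALSE/TRUE, which is why the ! in front is enough to turn them into a row selector.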
Upvotes: 1
Reputation: 145775
Filtering based on presence of NA values:
library(dplyr)

df %>%
  group_by(year) %>%
  filter(!anyNA(value))
Alternative filter conditions (pick what you find most readable):
all(!is.na(value))
sum(is.na(value)) == 0
!any(is.na(value))
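For a quick check against the question's sample data, something like the following should work. It assumes dplyr is installed; the trailing ungroup() and the names complete_years and alt are my additions, the former so that later steps in a long pipeline are not still grouped by year.

```r
library(dplyr)

# The question's sample data
df <- data.frame(
  year  = rep(c(2000, 2001), each = 4),
  day   = rep(1:4, times = 2),
  value = c(5, 3, 2, 3, 2, NA, 6, 5)
)

# Keep only the years in which no value is missing
complete_years <- df %>%
  group_by(year) %>%
  filter(!anyNA(value)) %>%
  ungroup()

# One of the alternative conditions, which should select the same rows
alt <- df %>%
  group_by(year) %>%
  filter(sum(is.na(value)) == 0) %>%
  ungroup()

complete_years
```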
Upvotes: 5