Reputation: 21
I have a dataset df like this, which is the data collected from individuals using a repeating instrument:
ID <- c('A1', 'A1', 'A1', 'A1', 'A2', 'A2', 'A2', 'A2', 'A2', 'A2', 'A3', 'A3', 'A3', 'A3', 'A4', 'A4', 'A4', 'A4', 'A4', 'A4', 'A4', 'A5', 'A5', 'A5', 'A5', 'A5', 'A5')
day_stat <- c(2, 1, 1, 2, 2, 2, 2, 1, 1, 1, 2, NA, NA, NA, NA, NA, 1, 1, 1, NA, NA, 1, 2, 2, 2, 1, NA)
adm_dat <- c(NA, NA, NA, NA, NA, NA, NA, '2020-10-12', NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, '2020-10-18', NA, NA)
adm_ever <- c(NA, NA, NA, 1, NA, NA, NA, NA, NA, 2, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 1, NA, NA, NA, NA, NA, NA)
df <- data.frame(ID, day_stat, adm_dat, adm_ever)
I am trying to filter the data like this:
df1 = df %>% filter(day_stat==1 | adm_dat!= NA | adm_ever==1)
Current result (not wanted):
What I want instead: if any one of these filter conditions is true for an ID, keep all event data for that ID.
Upvotes: 0
Views: 49
Reputation: 886938
We can use data.table
library(data.table)
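# na.rm = TRUE makes any() return FALSE instead of NA for an ID where every comparison is NA
setDT(df)[, .SD[any(day_stat==1 | !is.na(adm_dat) | adm_ever==1, na.rm = TRUE)], ID]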
Upvotes: 0
Reputation: 388807
A comparison with NA (such as adm_dat != NA) always returns NA, so use is.na to check for NA values; to select an entire group, use group_by together with any:
library(dplyr)
df %>%
group_by(ID) %>%
filter(any(day_stat==1 | !is.na(adm_dat) | adm_ever==1))
# ID day_stat adm_dat adm_ever
# <chr> <dbl> <chr> <dbl>
# 1 A1 2 NA NA
# 2 A1 1 NA NA
# 3 A1 1 NA NA
# 4 A1 2 NA 1
# 5 A2 2 NA NA
# 6 A2 2 NA NA
# 7 A2 2 NA NA
# 8 A2 1 2020-10-12 NA
# 9 A2 1 NA NA
#10 A2 1 NA 2
# … with 13 more rows
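For readers who want to see why the original filter failed, here is a minimal base R sketch of the same idea. It rebuilds the question's data (using rep() for the runs of NA), shows that comparing against NA never yields TRUE, and then keeps every row of any ID with at least one matching row. The names x, hit, keep and df_kept are illustrative only, and ave() is used here as one possible base R stand-in for the grouped filter, not as the method from either answer.

```r
# Same data as in the question; rep() is only shorthand for the runs of NA
ID <- c(rep('A1', 4), rep('A2', 6), rep('A3', 4), rep('A4', 7), rep('A5', 6))
day_stat <- c(2, 1, 1, 2, 2, 2, 2, 1, 1, 1, 2, NA, NA, NA,
              NA, NA, 1, 1, 1, NA, NA, 1, 2, 2, 2, 1, NA)
adm_dat <- c(rep(NA, 7), '2020-10-12', rep(NA, 16), '2020-10-18', NA, NA)
adm_ever <- c(NA, NA, NA, 1, rep(NA, 5), 2, rep(NA, 10), 1, rep(NA, 6))
df <- data.frame(ID, day_stat, adm_dat, adm_ever)

# Comparing with NA gives NA, never TRUE or FALSE
x <- c('2020-10-12', NA)
x != NA      # NA NA
!is.na(x)    # TRUE FALSE

# Row-level condition; %in% TRUE turns NA results into FALSE
hit <- (df$day_stat == 1 | !is.na(df$adm_dat) | df$adm_ever == 1) %in% TRUE

# Spread the flag to every row of the same ID, then subset
keep <- ave(hit, df$ID, FUN = any)
df_kept <- df[keep, ]

unique(df_kept$ID)   # "A1" "A2" "A4" "A5"
nrow(df_kept)        # 23
```

A3 is dropped because none of its rows satisfies any condition: its only non-missing day_stat is 2, and adm_dat and adm_ever are NA throughout.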
Upvotes: 1