Reputation: 415
I have a data frame in long format with repeated measurements of id, age and height. How can I restrict the dataset so that it includes only people with at least 1 measurement taken from age 5 onwards and also at least 1 measurement between ages 9 and 20?
(So a person who has only 1 height measurement, taken before age 9, would be excluded, because they don't also have a measurement between 9 and 20.)
# LOAD SITAR PACKAGE WITH EXAMPLE DATASET (berkeley)
library(sitar)
library(dplyr)  # needed for %>%, select() and filter()
data <- berkeley %>% select(id, age, height)
summary(data)
# THIS RESTRICTS TO HEIGHTS TAKEN AT age >= 5: HOW TO ALSO RESTRICT TO >= 1 MEASURE BETWEEN AGE 9 AND 20?
data <- data %>% filter(!is.na(age) & !is.na(height) & age >= 5)
Upvotes: 0
Views: 121
Reputation: 887213
An option with data.table
library(data.table)
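# keep all rows of an id that has a measurement at age >= 5 and one between ages 9 and 20 (inclusive)
setDT(data)[, .SD[any(age >= 5) && any(between(age, 9, 20))], by = id]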
Upvotes: 1
Reputation: 389012
You could do
library(dplyr)
data %>%
group_by(id) %>%
filter(any(age >= 5) & any(between(age, 9, 20)))  # each criterion checked per id
But it seems that in your example all the ids satisfy both criteria.
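Since every id in the example data passes, a small made-up data frame can help confirm that the grouped filter really drops people. The sketch below is only an illustration: the `toy` data and its values are hypothetical and not taken from `berkeley`. It first applies the row-level restriction from the question (age >= 5, no missing values) and then the per-id check for a measurement between ages 9 and 20.

```r
library(dplyr)

# Hypothetical toy data (not from the berkeley dataset)
toy <- data.frame(
  id     = c(1, 1, 2, 2, 2, 3, 3, 4),
  age    = c(3, 7, 6, 10, 15, 2, 4, 12),
  height = c(95, 120, 115, 138, 165, 85, 100, 150)
)

kept <- toy %>%
  # row-level restriction: complete rows measured at age 5 or later
  filter(!is.na(age), !is.na(height), age >= 5) %>%
  group_by(id) %>%
  # group-level restriction: id needs at least one measurement at ages 9 to 20
  filter(any(between(age, 9, 20))) %>%
  ungroup()

unique(kept$id)
```

Here id 1 is dropped because its only measurement from age 5 onwards is at age 7, id 3 is dropped because it has no measurement from age 5 onwards, and ids 2 and 4 are kept.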
Upvotes: 2