aelhak

Reputation: 415

Restrict a dataframe based on age at measurement

I have a dataframe in long format with repeated measurements of age and height for each id. How can I restrict the dataset so that it includes only people with at least one measurement taken from age 5 onwards and also at least one measurement between ages 9 and 20?

(So if a person has only one height measurement and it was taken before age 9, they will be excluded, because they do not also have a measurement between 9 and 20.)

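# LOAD SITAR PACKAGE WITH EXAMPLE DATASET, AND DPLYR FOR %>%, select() AND filter()
library(sitar)
library(dplyr)

data <- berkeley %>% select(id, age, height)
summary(data)

# THIS RESTRICTS TO HEIGHTS TAKEN AT AGE >= 5: HOW TO ALSO RESTRICT TO >= 1 MEASUREMENT BETWEEN AGES 9 AND 20?
# NOTE: MISSING VALUES ARE TESTED WITH is.na(), NOT BY COMPARING TO THE STRING "NA"
data <- data %>% filter(!is.na(age) & !is.na(height) & age >= 5)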

Upvotes: 0

Views: 121

Answers (2)

akrun

Reputation: 887213

An option with data.table

library(data.table)
setDT(data)[, .SD[any(age > 5 & between(age, 9, 20))], id]
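
As a rough illustration of the same per-id idea, here is a minimal sketch on a made-up table (the `toy` data and the `kept` name are hypothetical, not taken from `berkeley`). It uses the `if (...) .SD` form, which is assumed here as an alternative way to drop whole groups in data.table, not as the method used above.

```r
library(data.table)

# Hypothetical toy data: id 1 has a measurement at age 10,
# id 2 is only measured before age 9
toy <- data.table(
  id     = c(1, 1, 1, 2, 2),
  age    = c(4, 6, 10, 5, 8),
  height = c(100, 115, 140, 108, 125)
)

# Return all rows of an id only if at least one age falls in [9, 20];
# groups for which the if() yields NULL are dropped from the result
kept <- toy[, if (any(between(age, 9, 20))) .SD, by = id]

print(kept)
```

In this sketch only the three rows of id 1 are returned, including the measurement at age 4, because the condition selects whole ids and not individual rows.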

Upvotes: 1

Ronak Shah

Reputation: 389012

You could do

library(dplyr)

data %>%
  group_by(id) %>%
  filter(any(age > 5 & between(age, 9, 20)))

But it seems that in your example every id satisfies both criteria.
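
To see the filter actually exclude someone, here is a minimal sketch on a made-up data frame (the `toy` data and the `kept` name are hypothetical, not from `berkeley`). It assumes the two conditions are meant to be checked separately per id, uses `>= 5` to match "from age 5 onwards", and drops rows with missing values first.

```r
library(dplyr)

# Hypothetical toy data:
#   id 1 has measurements at ages 4, 6 and 10
#   id 2 is only measured before age 9
#   id 3 has one valid measurement at age 7 and one row with a missing age
toy <- data.frame(
  id     = c(1, 1, 1, 2, 2, 3, 3),
  age    = c(4, 6, 10, 5, 8, 7, NA),
  height = c(100, 115, 140, 108, 125, 120, 150)
)

kept <- toy %>%
  filter(!is.na(age), !is.na(height)) %>%   # drop incomplete rows first
  group_by(id) %>%
  # keep an id only if it has a measurement at age >= 5
  # and a measurement between ages 9 and 20 (inclusive)
  filter(any(age >= 5), any(between(age, 9, 20))) %>%
  ungroup()

unique(kept$id)
```

Here only id 1 is kept. Any age between 9 and 20 is also at least 5, so the first condition is implied by the second; it is written out only to mirror the wording of the question. To also drop the rows measured before age 5, add `age >= 5` to the first `filter()`.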

Upvotes: 2
