Reputation: 317
I have a dataset in a region where schools are segregated by gender, and I am thinking of comparing gender performance within the same school, but to do that, I want to limit my data to only include schools teaching both genders. In other words, I would like to remove schools that only teach either females or males.
Below is my current code, but it's giving me zero observations, although the data includes several schools teaching both genders:
## Limit Riyadh schools only to schools teaching both genders
two_gender_schools <- filter(riyadh_scores, school_name == "",
gender == "male", gender == "female")
My question is, is there an efficient way to subset my data without having to manually specify each school name teaching both genders?
Upvotes: 1
Views: 82
Reputation: 146224
When you give filter multiple conditions, it combines them with "and". So your code looks for rows where the school name is blank (school_name == ""), and the gender is "male", and the gender is "female". No single row can satisfy all of these at once, which is why you get zero observations.
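To see the "and" behaviour in isolation, here is a minimal sketch on a made-up data frame (the toy object and its values are assumptions for illustration, not your data):

```r
library(dplyr)

# Toy data for illustration only (hypothetical values)
toy <- data.frame(
  school_name = c("A", "A", "B"),
  gender      = c("male", "female", "male")
)

# Comma-separated conditions are joined with AND and evaluated row by row
both_conditions <- filter(toy, gender == "male", gender == "female")
nrow(both_conditions)
# 0: no row is "male" and "female" at the same time
```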
Instead, you should group_by(school_name) and proceed from there. A couple of options:
library(dplyr)

# %in% checks anywhere in the group, not row by row
two_gender_schools_a = riyadh_scores %>%
  group_by(school_name) %>%
  filter("female" %in% gender & "male" %in% gender)

# look for schools that have more than 1 distinct value for gender
two_gender_schools_b = riyadh_scores %>%
  group_by(school_name) %>%
  filter(n_distinct(gender) > 1)
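If you want to check the grouped filter before running it on the full data, a small sketch like the following may help. The toy_scores data frame, its school names, and its score values are invented for illustration; only the column names school_name and gender are taken from the question.

```r
library(dplyr)

# Toy data: hypothetical schools and scores, for illustration only
toy_scores <- data.frame(
  school_name = c("A", "A", "B", "B", "C", "C", "C"),
  gender      = c("male", "female", "male", "male", "female", "female", "male"),
  score       = c(70, 82, 65, 74, 90, 88, 79)
)

# Keep every row of any school that has more than one distinct gender
mixed <- toy_scores %>%
  group_by(school_name) %>%
  filter(n_distinct(gender) > 1) %>%
  ungroup()  # drop the grouping so later steps are not grouped by accident

unique(mixed$school_name)
# "A" "C": school B has only male rows, so it is dropped
```

One thing to watch: n_distinct() counts NA as its own value by default, so a school with "male" plus missing genders would pass the second option. If your gender column can contain NA, n_distinct(gender, na.rm = TRUE) or the %in% version avoids that.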
Upvotes: 5