Reputation: 1
I'm using CPS data, which surveys households and collects information on each individual within them. I'm trying to create a variable indicating whether a given individual lives in a household with another person who has a disability (i.e., someone else in the household has a disability, regardless of whether the individual has one).
I know that this would be relatively easy to do with dplyr if I didn't need to exclude the person's own disability status from the calculation - simply taking the max value of disability for each household using the group_by() and mutate() functions.
cps <- cps %>%
  group_by(householdid) %>%
  mutate(maxDISABLED_HH = max(diffany, na.rm = TRUE))
But it's not totally clear to me how to do this while excluding the disability status of the observation I'm generating the variable for. Can anyone advise?
Sample data to work with below:
cps<-data.frame(householdid = c(1,1,1,2,2,2,2,3,3,3,4,4,4,4,5,5,5,5,5), diffany = c(1,1,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0))
TL;DR: I want to create a variable holding the group maximum of another variable, computed over every observation in the group except the one I'm generating the variable for. I've tried using group_by() and mutate() but haven't figured out how to exclude that observation.
Upvotes: 0
Views: 32
Reputation: 17229
Assuming the result does not have to come from max(), as long as those individuals are detected, you can subtract each row's own diffany from the household sum and check whether anything is left. For rows where diffany == 0, subtracting does not change the group sum; if a row happens to be the only diffany == 1 case in its household, sum(diffany) - diffany evaluates to 0 and that row will not be flagged.
cps <- data.frame(householdid = c(1,1,1,2,2,2,2,3,3,3,4,4,4,4,5,5,5,5,5),
                  diffany = c(1,1,0,0,0,0,1,0,0,1,0,1,0,0,0,0,0,0,0))

cps |>
  dplyr::mutate(with_dis = (sum(diffany) - diffany) > 0, .by = householdid)
#>    householdid diffany with_dis
#> 1            1       1     TRUE
#> 2            1       1     TRUE
#> 3            1       0     TRUE
#> 4            2       0     TRUE
#> 5            2       0     TRUE
#> 6            2       0     TRUE
#> 7            2       1    FALSE
#> 8            3       0     TRUE
#> 9            3       0     TRUE
#> 10           3       1    FALSE
#> 11           4       0     TRUE
#> 12           4       1    FALSE
#> 13           4       0     TRUE
#> 14           4       0     TRUE
#> 15           5       0    FALSE
#> 16           5       0    FALSE
#> 17           5       0    FALSE
#> 18           5       0    FALSE
#> 19           5       0    FALSE
Created on 2023-06-04 with reprex v2.0.2
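One caveat with the sum-minus-self approach: if diffany contains missing values in the real CPS extract, sum(diffany) returns NA for the whole household. Below is a minimal sketch of an NA-tolerant variant. It assumes dplyr >= 1.1.0 (for .by) and R >= 4.1 (for |>), and it assumes that a missing report should be treated as "no disability reported", which is a modelling choice rather than something stated in the question. The column name others_dis and the NA in the sample data are made up for illustration.

```r
library(dplyr)

# Shortened sample data with one hypothetical missing value in household 2
cps <- data.frame(
  householdid = c(1, 1, 1, 2, 2, 2, 2, 3, 3, 3),
  diffany     = c(1, 1, 0, 0, 0, NA, 1, 0, 0, 1)
)

cps_flagged <- cps |>
  mutate(
    # Household count of reported disabilities, ignoring NAs, minus the
    # row's own report (a missing own value is counted as 0: assumption)
    others_dis = sum(diffany, na.rm = TRUE) - coalesce(diffany, 0),
    # TRUE when at least one *other* household member reports a disability
    with_dis   = others_dis > 0,
    .by = householdid
  )

cps_flagged
```

Here the row with the missing diffany in household 2 is still flagged TRUE, because another member of that household reports a disability, while the only member with diffany == 1 in that household stays FALSE.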
Upvotes: 0