Reputation: 219
I have a survey dataset which includes intra-household relationships and I'm trying to write code to identify lone parent households, defined as a household where the parent of a dependent child does not have a cohabiting partner.
Intra-family relationships are coded as: 1 = Spouse, 2 = Cohabiting partner, 3 = Son/daughter, 4 = Step son/daughter, 5 = Foster child, 6 = Son-in-law/daughter-in-law, 7 = Parent/guardian, 8 = Step-parent, 9 = Foster parent, 10 = Parent-in-law, 11 = Brother/sister, 12 = Step-brother/sister, 13 = Foster brother/sister, 14 = Brother/sister-in-law, 15 = Grand-child, 16 = Grand-parent, 17 = Other relative, 18 = Other non-relative.
A parent is therefore identified by 7, 8, or 9 in any of that person's relationship columns. Whether their child is dependent (under 18), however, is recorded in the child's depchild column. Whether the parent of a depchild has a partner is identified by 1 or 2 in any of the parent's relationship columns.
I can't preclude the possibility of multiple families within a given household (e.g. two lone mothers independently living with two dependent children), so the presence of two parents within a household with dependent children does not automatically mean a non-lone-parent household. If there are any lone parents in a household, i.e. a parent of a dependent child who does not have a partner, the household should be tagged as lonepar = 1.
Example Data
household person depchild R01 R02 R03 R04 R05 R06
1 1 1 0 NA 1 7 7 NA NA
2 1 2 0 1 NA 7 7 NA NA
3 1 3 0 3 3 NA 11 NA NA
4 1 4 1 3 3 11 NA NA NA
5 2 1 0 NA 7 16 NA NA NA
6 2 2 0 3 NA 7 NA NA NA
7 2 3 1 15 3 NA NA NA NA
8 3 1 0 NA 18 NA NA NA NA
9 3 2 0 18 NA NA NA NA NA
10 4 1 0 NA NA NA NA NA NA
11 5 1 0 NA 9 NA NA NA NA
12 5 2 1 5 NA 18 NA NA NA
13 5 3 0 2 18 NA NA NA NA
In the above example, dependent children (depchild = 1) are on rows 4, 7 and 12. The parents of the child on row 4 have spouses, indicated by 1 in R02 and R01 respectively; the household is therefore not a lone-parent household, so it should be lonepar = 0. The parent of the depchild on row 7 (row 6), however, has neither a spouse (1) nor a cohabiting partner (2), so the household should be lonepar = 1.
Output sought
household person depchild R01 R02 R03 R04 R05 R06 lonepar
1 1 1 0 NA 1 7 7 NA NA 0
2 1 2 0 1 NA 7 7 NA NA 0
3 1 3 0 3 3 NA 11 NA NA 0
4 1 4 1 3 3 11 NA NA NA 0
5 2 1 0 NA 7 16 NA NA NA 1
6 2 2 0 3 NA 7 NA NA NA 1
7 2 3 1 15 3 NA NA NA NA 1
8 3 1 0 NA 18 NA NA NA NA 0
9 3 2 0 18 NA NA NA NA NA 0
10 4 1 0 NA NA NA NA NA NA 0
11 5 1 0 NA 9 NA NA NA NA 0
12 5 2 1 5 NA 18 NA NA NA 0
13 5 3 0 2 18 NA NA NA NA 0
Example Code
df <- data.frame(household = c(1,1,1,1,2,2,2,3,3,4,5,5,5),
                 person = c(1,2,3,4,1,2,3,1,2,1,1,2,3),
                 depchild = c(0,0,0,1,0,0,1,0,0,0,0,1,0),
                 R01 = c(NA, 1, 3, 3, NA, 3, 15, NA, 18, NA, NA, 5, 2),
                 R02 = c(1, NA, 3, 3, 7, NA, 3, 18, NA, NA, 9, NA, 18),
                 R03 = c(7, 7, NA, 11, 16, 7, rep(NA, 5), 18, NA),
                 R04 = c(7, 7, 11, rep(NA, 10)),
                 R05 = rep(NA, 13),
                 R06 = rep(NA, 13))
Upvotes: 1
Views: 44
Reputation: 174468
Rather than concentrating on the relationships of the parents, concentrate on the relationships of the dependent children. If a dependent child has only a single relation with a value of 3, 4, or 5, then that dependent child has only a single parent in the household.
Essentially, we count up the instances of 3, 4, and 5 in each row for every person in the data frame. Then we group by household. If anyone in that household is a dependent child who only had one instance of a 3, 4, or 5 relationship code, then that household contains a dependent child with only one parent. It is therefore a lone parent household.
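To see what that count does for a single person, here is a small sketch applied to the relationship codes of two dependent children from the example data. The helper name `count_child_codes` is only illustrative; it wraps the same `match()` / `na.omit()` idiom used in the pipeline.

```r
# Count how many of a person's relationship codes are child-type codes (3, 4 or 5).
# match() returns NA for anything not in 3:5 (including NA itself), so dropping
# the NAs leaves one entry per child-type code.
count_child_codes <- function(codes) length(na.omit(match(codes, 3:5)))

row7 <- c(15, 3, NA, NA, NA, NA)   # dependent child in household 2
row4 <- c(3, 3, 11, NA, NA, NA)    # dependent child in household 1

count_child_codes(row7)
#> [1] 1
count_child_codes(row4)
#> [1] 2

# sum(codes %in% 3:5) gives the same count, since NA %in% 3:5 is FALSE
sum(row7 %in% 3:5)
#> [1] 1
```

The child on row 7 has one parent recorded in the household, while the child on row 4 has two.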
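library(tidyverse)

df %>%
  rowwise() %>%
  # n = number of child-type codes (3, 4 or 5) in this person's relationship columns
  mutate(n = length(na.omit(match(c(R01, R02, R03, R04, R05, R06), 3:5)))) %>%
  group_by(household) %>%
  # flag the whole household if any dependent child has exactly one such code
  mutate(lonepar = as.numeric(any(depchild == 1 & n == 1))) %>%
  select(-n)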
#> # A tibble: 12 x 10
#> # Groups: household [5]
#> household person depchild R01 R02 R03 R04 R05 R06 lonepar
#> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <lgl> <lgl> <dbl>
#> 1 1 1 0 NA 1 7 7 NA NA 0
#> 2 1 2 0 1 NA 7 7 NA NA 0
#> 3 1 3 0 3 3 NA 11 NA NA 0
#> 4 1 4 1 3 3 11 NA NA NA 0
#> 5 2 1 0 NA 7 16 NA NA NA 1
#> 6 2 2 0 3 NA 7 NA NA NA 1
#> 7 2 3 1 15 3 NA NA NA NA 1
#> 8 3 1 0 NA 18 NA NA NA NA 0
#> 9 3 2 0 18 NA NA NA NA NA 0
#> 10 4 1 0 NA NA NA NA NA NA 0
#> 11 5 1 0 NA 9 NA NA NA NA 1
#> 12 5 2 1 5 NA NA NA NA NA 1
Created on 2022-05-16 by the reprex package (v2.0.1)
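One caveat: counting a child's parents treats a parent as lone whenever the partner is not also recorded as a parent of that child, which is why household 5 comes out as 1 above while the question asks for 0. The following is a minimal base R sketch that checks the parent's partner links instead. It assumes that column R0k holds each person's relationship to person k, and that a partnership or parent link may be recorded on only one side of a pair. The names `rel_cols`, `has` and `lone_parent_household` are illustrative, not part of the code above.

```r
# Example data from the question
df <- data.frame(household = c(1,1,1,1,2,2,2,3,3,4,5,5,5),
                 person    = c(1,2,3,4,1,2,3,1,2,1,1,2,3),
                 depchild  = c(0,0,0,1,0,0,1,0,0,0,0,1,0),
                 R01 = c(NA, 1, 3, 3, NA, 3, 15, NA, 18, NA, NA, 5, 2),
                 R02 = c(1, NA, 3, 3, 7, NA, 3, 18, NA, NA, 9, NA, 18),
                 R03 = c(7, 7, NA, 11, 16, 7, rep(NA, 5), 18, NA),
                 R04 = c(7, 7, 11, rep(NA, 10)),
                 R05 = rep(NA, 13),
                 R06 = rep(NA, 13))

rel_cols <- paste0("R0", 1:6)

# TRUE wherever a matrix cell holds one of the given codes (NA counts as FALSE)
has <- function(m, codes) matrix(m %in% codes, nrow = nrow(m))

lone_parent_household <- function(hh) {
  # rel[i, j] = relationship of the i-th person in hh to the j-th person in hh
  rel <- as.matrix(hh[, rel_cols])[, hh$person, drop = FALSE]

  # parent_of[i, j]: person j is a parent of person i, read from either side
  # (child codes 3:5 on the child's row, or parent codes 7:9 on the parent's row)
  parent_of <- has(rel, 3:5) | t(has(rel, 7:9))

  # A person is partnered if they list a spouse/partner or someone lists them as one
  partner <- has(rel, 1:2)
  partnered <- rowSums(partner) > 0 | colSums(partner) > 0

  # Parents of at least one dependent child in this household
  dep <- hh$depchild == 1
  parent_of_dep <- colSums(parent_of[dep, , drop = FALSE]) > 0

  any(parent_of_dep & !partnered)
}

flag <- sapply(split(df, df$household), lone_parent_household)
df$lonepar <- as.numeric(flag[as.character(df$household)])
df$lonepar
#> [1] 0 0 0 0 1 1 1 0 0 0 0 0 0
```

On the example data this reproduces the output sought in the question, including lonepar = 0 for household 5, where person 3 is recorded as the cohabiting partner of the foster parent. The sketch treats a parent as partnered if they have any partner in the household, and it has not been tested beyond this example.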
Upvotes: 1