Reputation: 73
I'm trying to analyse a multiple response question from a weighted survey dataset. I like the srvyr package because it lets me use dplyr pipes, but I can't find any reference material on how to handle multiple response questions.
I have a simple dataset looking at different sources of income. Here's an example of what the data looks like:
library(dplyr)
library(srvyr)
ID <- c(1,2,3,4,5,6,7,8,9,10)
rent_income <- c("Yes", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "No")
salary_income <- c( "No", "Yes", "No", "Yes", "No", "Yes", "Yes", "No", "Yes", "No")
other_income <- c( "No", "Yes", "No", "No", "No", "No", "Yes", "No", "No", "No")
survey_weights <- c(0.6, 1.2 , 1.1 , 0.7 , 2.4 , 1.1 , 0.3 , 0.6 , 1.0 , 0.8)
df<-data.frame(ID, rent_income, salary_income, other_income, survey_weights)
Note that the data is entirely made up. With srvyr, I first have to create a survey object:
weighted_dataset <- df %>% as_survey_design(ids=ID, weights=survey_weights)
Now I want to calculate the weighted percentage of the sample that has each type of income. Any ideas on how to do that? In Stata there is a function called mr_tab, but I can't find a similar one in R.
Upvotes: 2
Views: 1028
Reputation: 6278
You could use the convenient group_by() and variable selection syntax available through the dplyr and srvyr R packages.
weighted_dataset %>%
# Organize the data into groups defined by each combination of the income variables
group_by_at(vars(ends_with("_income"))) %>%
# With no argument, survey_mean() estimates the proportion in each level of the
# last grouping variable, within each combination of the preceding ones
summarize(Percent = survey_mean())
> # A tibble: 6 x 5
>   rent_income salary_income other_income Percent Percent_se
>   <fct>       <fct>         <fct>          <dbl>      <dbl>
> 1 No          No            No             1          0
> 2 No          Yes           No             0.769      0.265
> 3 No          Yes           Yes            0.231      0.265
> 4 Yes         No            No             1          0
> 5 Yes         Yes           No             0.6        0.312
> 6 Yes         Yes           Yes            0.4        0.312
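Note that grouping by all three income variables gives proportions within each combination of answers, not one overall figure per income type. If what you want is the weighted share of respondents reporting each source, a minimal sketch could look like the following. It assumes that turning each "Yes"/"No" column into a 0/1 indicator is acceptable for your purposes; the output names rent, salary and other are purely illustrative.

```r
library(dplyr)
library(srvyr)

# Same made-up data as in the question
df <- data.frame(
  ID = 1:10,
  rent_income    = c("Yes", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "No"),
  salary_income  = c("No", "Yes", "No", "Yes", "No", "Yes", "Yes", "No", "Yes", "No"),
  other_income   = c("No", "Yes", "No", "No", "No", "No", "Yes", "No", "No", "No"),
  survey_weights = c(0.6, 1.2, 1.1, 0.7, 2.4, 1.1, 0.3, 0.6, 1.0, 0.8)
)

weighted_dataset <- df %>%
  as_survey_design(ids = ID, weights = survey_weights)

# The weighted mean of a 0/1 indicator is the weighted proportion answering "Yes".
# Each survey_mean() call yields an estimate plus a standard error column (<name>_se).
income_shares <- weighted_dataset %>%
  summarize(
    rent   = survey_mean(as.numeric(rent_income == "Yes")),
    salary = survey_mean(as.numeric(salary_income == "Yes")),
    other  = survey_mean(as.numeric(other_income == "Yes"))
  )

income_shares
```

As a sanity check, the point estimates should match a hand calculation: the weights sum to 9.8, and the "Yes" weights sum to 4.2 for rent, 4.3 for salary and 1.5 for other, so roughly 0.43, 0.44 and 0.15. Because a respondent can report several sources, these shares are not expected to add up to 1.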
Upvotes: 1
Reputation: 6094
See the "proportions by group" block of https://cran.r-project.org/web/packages/srvyr/vignettes/srvyr-vs-survey.html
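Applied to the made-up data from the question, that pattern might look roughly like this for a single income variable. This is only a sketch of the group_by() plus survey_mean() idiom, not a copy of the vignette code; the names rent_dist and proportion are my own, and I've stored the answers as factors to match the <fct> columns shown in the other answer.

```r
library(dplyr)
library(srvyr)

# Made-up data from the question, with answers stored as factors
df <- data.frame(
  ID = 1:10,
  rent_income    = c("Yes", "Yes", "No", "Yes", "No", "Yes", "No", "Yes", "No", "No"),
  survey_weights = c(0.6, 1.2, 1.1, 0.7, 2.4, 1.1, 0.3, 0.6, 1.0, 0.8),
  stringsAsFactors = TRUE
)

# One row per answer category, with its weighted proportion and standard error
rent_dist <- df %>%
  as_survey_design(ids = ID, weights = survey_weights) %>%
  group_by(rent_income) %>%
  summarize(proportion = survey_mean())

rent_dist
```

Repeating this for each income column gives the Yes/No split per source, which is one way to tabulate a multiple response question one item at a time.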
Upvotes: 0