Reputation: 43
I need some help analyzing survey data.
Here is my code. Data prep
library(survey)
library(srvyr)
data(api)
dclus2 <- apiclus1 %>%
as_survey_design(dnum, weights = pw, fpc = fpc)
These two codes give me the same result.
One using the package survey
#Code
survey::svymean(~awards, dclus2)
#Results
mean SE
awardsNo 0.28962 0.033
awardsYes 0.71038 0.033
One using the package srvyr
#Code
srvyr::dclus2%>%
group_by(awards)%>%
summarise(m=survey_mean())
#Results
awards m m_se
No 0.2896175 0.0330183
Yes 0.7103825 0.0330183
I would like to get the survey mean of by the variable "awards" subset by the variable "stype" with levels No and Yes.
In the survey package, interaction is used
eg.svymean(~interaction(awards,stype), dclus2)
How do I get the same result using the srvyr package?
Thank you for your help
How do get the result below using the package srvyr?
#Code
svymean(~interaction(awards,stype), dclus2)
#Results
mean SE
interaction(awards, stype)No.E 0.180328 0.0250
interaction(awards, stype)Yes.E 0.606557 0.0428
interaction(awards, stype)No.H 0.043716 0.0179
interaction(awards, stype)Yes.H 0.032787 0.0168
interaction(awards, stype)No.M 0.065574 0.0230
interaction(awards, stype)Yes.M 0.071038 0.0203
Upvotes: 1
Views: 745
Reputation: 6277
You can simply imitate the recommended behavior for survey
: create a new variable formed by concatenating distinct values of each of the component variables. That's all that the interaction()
function is doing for svymean()
.
library(survey)
library(srvyr)
data(api)
# Set up design object
dclus2 <- apiclus1 %>%
as_survey_design(dnum, weights = pw, fpc = fpc)
# Create 'interaction' variable
dclus2 %>%
mutate(awards_stype = paste(awards, stype, sep = " - ")) %>%
group_by(awards_stype) %>%
summarize(
prop = survey_mean()
)
#> # A tibble: 6 x 3
#> awards_stype prop prop_se
#> <chr> <dbl> <dbl>
#> 1 No - E 0.180 0.0250
#> 2 No - H 0.0437 0.0179
#> 3 No - M 0.0656 0.0230
#> 4 Yes - E 0.607 0.0428
#> 5 Yes - H 0.0328 0.0168
#> 6 Yes - M 0.0710 0.0203
To get the various component variables split back into separate columns, you can use the separate()
function from the tidyr
package.
# Separate the columns afterwards
dclus2 %>%
mutate(awards_stype = paste(awards, stype, sep = " - ")) %>%
group_by(awards_stype) %>%
summarize(
prop = survey_mean()
) %>%
tidyr::separate(col = "awards_stype",
into = c("awards", "stype"),
sep = " - ")
#> # A tibble: 6 x 4
#> awards stype prop prop_se
#> <chr> <chr> <dbl> <dbl>
#> 1 No E 0.180 0.0250
#> 2 No H 0.0437 0.0179
#> 3 No M 0.0656 0.0230
#> 4 Yes E 0.607 0.0428
#> 5 Yes H 0.0328 0.0168
#> 6 Yes M 0.0710 0.0203
Created on 2021-03-30 by the reprex package (v1.0.0)
Upvotes: 1