Ama Owusu-Darko

Reputation: 43

What is the equivalent of survey::svymean(~interaction()) using the srvyr package?

I need some help analyzing survey data.

Here is my code, starting with the data prep:

library(survey)
library(srvyr)
data(api)

dclus2 <- apiclus1 %>%
  as_survey_design(dnum, weights = pw, fpc = fpc)

These two snippets give me the same result.

One using the package survey

#Code
survey::svymean(~awards, dclus2)

#Results
             mean    SE
awardsNo  0.28962 0.033
awardsYes 0.71038 0.033

One using the package srvyr

#Code
dclus2 %>%
  group_by(awards) %>%
  summarise(m = survey_mean())

#Results
awards    m            m_se
No     0.2896175    0.0330183       
Yes    0.7103825    0.0330183

I would like to get the survey mean of the variable "awards" (with levels No and Yes) crossed with the variable "stype".

In the survey package, interaction() is used for this, e.g. svymean(~interaction(awards, stype), dclus2). How do I get the same result using the srvyr package?

Thank you for your help

How do I get the result below using the srvyr package?

#Code
svymean(~interaction(awards,stype), dclus2)

#Results
                                    mean     SE
interaction(awards, stype)No.E  0.180328 0.0250
interaction(awards, stype)Yes.E 0.606557 0.0428
interaction(awards, stype)No.H  0.043716 0.0179
interaction(awards, stype)Yes.H 0.032787 0.0168
interaction(awards, stype)No.M  0.065574 0.0230
interaction(awards, stype)Yes.M 0.071038 0.0203

Upvotes: 1

Views: 745

Answers (1)

bschneidr

Reputation: 6277

You can simply imitate the recommended behavior for survey: create a new variable formed by concatenating distinct values of each of the component variables. That's all that the interaction() function is doing for svymean().

library(survey)
library(srvyr)

data(api)

# Set up design object
dclus2 <- apiclus1 %>%
  as_survey_design(dnum, weights = pw, fpc = fpc)

# Create 'interaction' variable
dclus2 %>%
  mutate(awards_stype = paste(awards, stype, sep = " - ")) %>%
  group_by(awards_stype) %>%
  summarize(
    prop = survey_mean()
  )
#> # A tibble: 6 x 3
#>   awards_stype   prop prop_se
#>   <chr>         <dbl>   <dbl>
#> 1 No - E       0.180   0.0250
#> 2 No - H       0.0437  0.0179
#> 3 No - M       0.0656  0.0230
#> 4 Yes - E      0.607   0.0428
#> 5 Yes - H      0.0328  0.0168
#> 6 Yes - M      0.0710  0.0203
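The claim that interaction() only concatenates the component levels can be checked in base R. The sketch below uses two small made-up vectors (not the api data) and compares interaction() with paste(). One difference to keep in mind: by default interaction() keeps every level combination, including ones that never occur, while paste() only produces the combinations that are actually observed.

```r
# Toy vectors, made up purely for illustration (not taken from the api data)
awards <- factor(c("No", "Yes", "Yes", "No"))
stype  <- factor(c("E", "E", "H", "M"))

# interaction() builds a single factor whose labels are the pasted levels
combined <- interaction(awards, stype, sep = " - ")

# paste() builds the same labels as a plain character vector
pasted <- paste(awards, stype, sep = " - ")

# Row by row, the labels agree
identical(as.character(combined), pasted)

# interaction() also keeps unobserved combinations as empty levels
nlevels(combined)       # all 2 x 3 combinations
length(unique(pasted))  # only the combinations present in the data
```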

To get the various component variables split back into separate columns, you can use the separate() function from the tidyr package.

# Separate the columns afterwards
dclus2 %>%
  mutate(awards_stype = paste(awards, stype, sep = " - ")) %>%
  group_by(awards_stype) %>%
  summarize(
    prop = survey_mean()
  ) %>%
  tidyr::separate(col = "awards_stype",
                  into = c("awards", "stype"),
                  sep = " - ")
#> # A tibble: 6 x 4
#>   awards stype   prop prop_se
#>   <chr>  <chr>  <dbl>   <dbl>
#> 1 No     E     0.180   0.0250
#> 2 No     H     0.0437  0.0179
#> 3 No     M     0.0656  0.0230
#> 4 Yes    E     0.607   0.0428
#> 5 Yes    H     0.0328  0.0168
#> 6 Yes    M     0.0710  0.0203

Created on 2021-03-30 by the reprex package (v1.0.0)
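As an alternative sketch, and assuming a reasonably recent srvyr (I believe interact() was added around version 1.0.0, so check your installed version), the grouping variables can be wrapped in interact() inside group_by(). This should give proportions over the joint distribution of awards and stype, with the two variables already in separate columns. Note that a plain group_by(stype, awards) is not equivalent: as far as I know it reports the proportion of awards within each stype, not the joint proportion.

```r
library(survey)
library(srvyr)

data(api)

# Same design object as in the question
dclus2 <- apiclus1 %>%
  as_survey_design(dnum, weights = pw, fpc = fpc)

# interact() asks for proportions over the awards x stype cells jointly
joint <- dclus2 %>%
  group_by(interact(awards, stype)) %>%
  summarize(prop = survey_mean())

joint
```

The proportions in joint should sum to 1 across all six cells, which is a quick way to confirm that they are joint rather than within-group proportions.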

Upvotes: 1
