Nate

Reputation: 61

R: How to conduct a two-sample t-test with two different survey designs

I want to perform a two-sample (Welch's) t-test on the equality of two means. One mean is obtained using simple random sampling (srsmean), and the other is calculated using survey weighting with the survey package (mean_weighted). I also want to conduct a t-test between mean_weighted and the mean obtained when both weighting and stratification are implemented in the survey design (mean_strat).

I know there is a svyttest() function. However, as far as I can tell, it only tests the means of two samples within one survey design, not means obtained with different survey designs.

I also tried using rnorm to create fictional samples, e.g. c(rnorm(9710, mean = 156958.8, sd = 364368)). The problem with this approach is that in complex sampling methods like stratification, the effective n is usually smaller than the nominal n, so I am unsure what to put as n. This method also feels a bit contrived, as I would be fitting the data to a particular type of distribution.

Finally, I tried writing out the equation for a t-statistic myself. However, when calculating the "standard error of the difference of means" for a complex survey design, I again run into problems related to the "effective sample size."

Is there another approach that would work for both the t-test between srsmean and mean_weighted AND the t-test between mean_weighted and mean_strat?

library(survey)

wel <- c(68008.19, 128504.61,  21347.69,
         33272.95,  61828.96,  32764.44,
         92545.62,  58431.89,  95596.82,
         117734.27)
rmul <- c(16, 16, 16, 16, 16, 16, 16,
          20, 20, 20)
strat <- c(101, 101, 101, 101, 101, 102, 102, 102, 102, 102)


survey.data <- data.frame(wel, rmul, strat)

srsmean <- mean(survey.data$wel)

survey_weighted <- svydesign(data = survey.data,
                             ids = ~wel, 
                             weights = ~rmul, 
                             nest = TRUE)

mean_weighted <- svymean(~wel, survey_weighted)

survey_strat <- svydesign(data = survey.data,
                          ids = ~wel,
                          weights = ~rmul,
                          strata = ~strat,
                          nest = TRUE)
mean_strat <- svymean(~wel, survey_strat)

Upvotes: 2

Views: 777

Answers (1)

Anthony Damico

Reputation: 6104

i'm confused about the purpose of a t-test between your mean_weighted and mean_strat, since the difference between those coefficients will always be zero: both designs use the same weights, so stratification changes only the standard error, not the point estimate. i might compare the simple random sample against the complex design like this:

library(survey)

wel <- c(68008.19, 128504.61,  21347.69,
         33272.95,  61828.96,  32764.44,
         92545.62,  58431.89,  95596.82,
         117734.27)
rmul <- c(16, 16, 16, 16, 16, 16, 16,
          20, 20, 20)
strat <- c(101, 101, 101, 101, 101, 102, 102, 102, 102, 102)

survey.data <- data.frame(wel, rmul, strat)

survey_unweighted <- svydesign(data = survey.data,
                               ids = ~1)

mean_unweighted <- svymean(~wel, survey_unweighted)

survey_strat <- svydesign(data = survey.data,
                          ids = ~wel,
                          weights = ~rmul,
                          strata = ~strat,
                          nest = TRUE)
mean_strat <- svymean(~wel, survey_strat)


coef_one <- coef( mean_unweighted )
coef_two <- coef( mean_strat )
se_one <- SE( mean_unweighted )
se_two <- SE( mean_strat )

# test statistic for the difference; the standard error of the difference
# treats the two estimates as independent (no covariance term)
t_statistic <- abs( coef_one - coef_two ) / sqrt( se_one ^2 + se_two ^2 )
# two-sided p-value from the normal approximation
p_value <- ( 1 - pnorm( t_statistic ) ) * 2
# flag differences significant at the 5% level
sig_diff <- ifelse( p_value < 0.05 , "*" , "" )
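
to see why the mean_weighted vs mean_strat comparison is degenerate, here is a minimal sketch that rebuilds both designs and wraps the calculation above in a small helper. compare_means is a hypothetical name, not part of the survey package, and the designs use ids = ~1 rather than ids = ~wel, which is an assumption about the intended design. like the code above, it uses a normal approximation and treats the two estimates as independent.

```r
library(survey)

wel <- c(68008.19, 128504.61, 21347.69, 33272.95, 61828.96,
         32764.44, 92545.62, 58431.89, 95596.82, 117734.27)
rmul <- c(16, 16, 16, 16, 16, 16, 16, 20, 20, 20)
strat <- c(101, 101, 101, 101, 101, 102, 102, 102, 102, 102)
survey.data <- data.frame(wel, rmul, strat)

# three designs: unweighted, weights only, weights plus strata
# (the unweighted call warns that equal probabilities are assumed)
survey_unweighted <- svydesign(data = survey.data, ids = ~1)
survey_weighted <- svydesign(data = survey.data, ids = ~1, weights = ~rmul)
survey_strat <- svydesign(data = survey.data, ids = ~1, weights = ~rmul,
                          strata = ~strat)

mean_unweighted <- svymean(~wel, survey_unweighted)
mean_weighted <- svymean(~wel, survey_weighted)
mean_strat <- svymean(~wel, survey_strat)

# hypothetical helper: compare two svymean results using a normal
# approximation, with no covariance term between the two estimates
compare_means <- function(a, b) {
  diff <- as.numeric(coef(a) - coef(b))
  se_diff <- sqrt(as.numeric(SE(a))^2 + as.numeric(SE(b))^2)
  z <- diff / se_diff
  c(difference = diff, se = se_diff, statistic = z,
    p_value = 2 * pnorm(-abs(z)))
}

# same weights, so the point estimates coincide; only the SEs can differ
all.equal(unname(coef(mean_weighted)), unname(coef(mean_strat)))
compare_means(mean_weighted, mean_strat)

# unweighted vs. stratified design, as in the code above
compare_means(mean_unweighted, mean_strat)
```

the first comparison returns a difference of zero whatever the data, which is why only the second one is informative.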

Upvotes: 2
