SpecialK201

Reputation: 119

RStudio becomes unresponsive when using srvyr::as_survey_rep to take bootstrap samples from my data. What can I do?

I have a dataset with survey data from about 40,000 respondents (and about 600 variables), and I want to calculate statistics (mean, median, etc.) and their variance, including among subgroups like voters for specific political parties in a given country. In the survey documentation it is specified that I need to take into account not just demographic weights, but also stratification and clustering.

From what I have found, R packages like survey and srvyr are suitable for this purpose. Since the responses to the items I'm interested in are often skewed towards one extreme response (such as strongly agree/strongly disagree) and not anywhere near normally distributed, I think that using the bootstrap estimation procedure is appropriate.

I wrote the following code to take my bootstrap samples from a dataframe called ess18, in order to then calculate variances. The variables are anweight for demographic weights, psu for the primary sampling unit/cluster, and stratum for the strata. The function wouldn't run without me specifying that I do not want a finite population correction. First I generate the survey design object, then I want to generate the samples including bootstrap weights.

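    library(srvyr)

    # Survey design with weights, clusters (PSUs), and strata; no finite population correction
    ess18design <- as_survey_design(ess18,
                                    weights = anweight,
                                    ids = psu,
                                    strata = stratum,
                                    fpc = NULL)

    # Convert to a replicate-weights design with 10,000 bootstrap replicates
    ess18bootdesign <- as_survey_rep(ess18design, type = "bootstrap", replicates = 10000)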

The problem is that in this form, RStudio starts working on the task but never finishes it and eventually becomes unresponsive. Lowering the number of replicates helps: with up to 5000 replicates the process eventually finishes, but at 10000 it reliably does not.

I'm using a 2016 MacBook Pro, admittedly not the most powerful machine, but I didn't expect it to be overwhelmed by the task of taking bootstrap samples. I know I don't have a reproducible error. My question is mainly whether there is anything I can do to complete the task on my platform, or whether I have run into a hardware limitation.
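To put the hardware question in perspective, here is a rough back-of-the-envelope sketch of how much memory the replicate weights alone might need. It assumes the weights are held as a dense matrix with one 8-byte double per respondent per replicate and no compression; that storage layout is an assumption for illustration, not something confirmed from the package internals, and `repweight_gib` is a hypothetical helper name.

```r
# Rough upper bound on memory for a dense replicate-weight matrix.
# Assumption: one 8-byte double per respondent per replicate, no compression.
n_respondents <- 40000    # approximate sample size from the question
bytes_per_double <- 8

# Hypothetical helper: size in GiB for a given number of replicates
repweight_gib <- function(n_replicates) {
  n_respondents * n_replicates * bytes_per_double / 1024^3
}

# Compare the replicate counts mentioned in the question
for (r in c(1000, 5000, 10000)) {
  cat(sprintf("%5d replicates: about %.1f GiB\n", r, repweight_gib(r)))
}
```

Under these assumptions, 10000 replicates come to roughly 3 GiB for a single copy of the matrix, and R may hold more than one copy while building or modifying the object, so memory pressure is at least a plausible explanation for the difference between 5000 and 10000 replicates.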

Upvotes: 0

Views: 33

Answers (0)