Dan

Reputation: 13

Getting NAs when attempting to analyse dataset in both the survey and srvyr packages in R?

This is my first post (and I am a real beginner at R), so please go easy on me...

I'm attempting to analyse an Australian Electoral Study dataset in R. It is a survey conducted among a nationally representative sample of voters in Australia following Australian federal elections (surprise).

Like other datasets of its kind, it uses weights to ensure the national population is represented adequately.

When I use either the srvyr package or the survey package to analyse this data in R, it just outputs NAs instead of the statistics I'm looking for.

For example, when I try to find the percentage of respondents giving each answer to variable A1 (see the code at the bottom of the post if you'd like to reproduce this), I get this output:

# A tibble: 5 x 5
  A1           proportion proportion_se total total_se
  <fct>             <dbl>         <dbl> <dbl>    <dbl>
1 A good deal          NA           NaN    NA      NaN
2 Some                 NA           NaN    NA      NaN
3 Not much             NA           NaN    NA      NaN
4 None                 NA           NaN    NA      NaN
5 Item skipped         NA           NaN    NA      NaN

Obviously not ideal.

I don't quite know what I've done wrong, so any help would be terrific. Thank you so much in advance... and apologies for the long block of code (if I knew where I'd gone wrong, I'd only copy that chunk, I promise!) This is my code at the moment:

## getting the gang back together

library(tidyverse)
library(dplyr)
library(ggplot2)
library(srvyr)
library(survey)
library(haven)

download.file("http://legacy.ada.edu.au/ADAData/data/aes_2016_01365.sav", "AES_2016.sav")

aes_2016 <- read_spss("AES_2016.sav")

## cleaning the data.frame such that variables are factors

aes_2016_clean <- aes_2016

for (i in seq_along(aes_2016)) {
  try(aes_2016_clean[[i]] <- as_factor(aes_2016[[i]]))
}

## loading up the survey design in both srvyr and survey using the wt_enrol weights

aes_2016_srvyr <- as_survey_design(aes_2016_clean, ids = 1, weights = wt_enrol)

aes_2016_survey <- svydesign(id = ~1, weights = ~wt_enrol, data = aes_2016_clean)

## attempting to get proportion of respondents' answers to variable A1 in both srvyr and survey

aes_2016_srvyr %>%
  group_by(A1) %>%
  summarize(proportion = survey_mean(),
            total = survey_total())

svymean(~A1, aes_2016_survey)

Upvotes: 1

Views: 402

Answers (1)

dschwilk

Reputation: 356

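There are NAs in the data, and they propagate into every estimate unless you handle them, so you must decide how to deal with them. One option is to drop them with na.rm = TRUE, though this may not be what you want: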

aes_2016_srvyr %>%
  group_by(A1) %>%
  summarize(proportion = survey_mean(na.rm=TRUE),
            total = survey_total(na.rm=TRUE))

##   A1           proportion proportion_se total total_se
##   <fct>             <dbl>         <dbl> <dbl>    <dbl>
## 1 A good deal      0.337        0.0110   911.     30.8
## 2 Some             0.434        0.0119  1175.     35.7
## 3 Not much         0.181        0.0101   489.     29.2
## 4 None             0.0481       0.00649  130.     18.0
## 5 Item skipped     0            0          0       0   
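
The same idea should carry over to the survey package, since svymean() also takes an na.rm argument. Here is a minimal sketch on made-up data rather than the AES file: the toy data frame, its wt column and the object names are all hypothetical, and it assumes the missing values sit in the variable being analysed. It first counts the NAs per column, then compares the estimate with and without na.rm = TRUE.

```r
library(survey)

# Toy data standing in for the AES file (hypothetical values):
# one factor with a missing answer, plus a numeric weight column.
toy <- data.frame(
  A1 = factor(c("Some", "None", NA, "Some")),
  wt = c(1, 2, 1, 1)
)

# Count missing values per column before building the design.
na_counts <- colSums(is.na(toy))
print(na_counts)

design <- svydesign(ids = ~1, weights = ~wt, data = toy)

# Default behaviour: the missing answer propagates into the estimates.
na_result <- svymean(~A1, design)
print(na_result)

# With na.rm = TRUE the rows with a missing A1 are dropped first,
# so the proportions are among respondents who gave an answer.
ok_result <- svymean(~A1, design, na.rm = TRUE)
print(ok_result)
```

Note that na.rm = TRUE removes those respondents from the denominator, so the proportions describe only people with a recorded answer. If the share of missing answers matters for your analysis, you would need to keep them as their own category instead of dropping them.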

Upvotes: 1
