Reputation: 43
I am working with NSCH data for the first time in R. I found the following resource, which has been enormously helpful (thank you!).
https://github.com/ajdamico/asdfree/blob/master/nsch.Rmd
I followed the code from that GitHub repository exactly for importing, setting up, and reshaping the data and for creating the survey design, and it ran without errors. I only began making modifications when I tried to calculate summary statistics for other variables. The variables used in the examples (e.g., age, poverty level) produce summary statistics, but the other variables I try return NA.
I attempted to replicate the example code below to get summary statistics on one of the numeric variables, ace5.
#Calculate the mean (average) of a linear variable, overall and by groups:
#example code below
MIcombine( with( nsch_design , svymean( ~ sc_age_years ) ) )
#My code
MIcombine( with( nsch_design , svymean( ~ ace5 ) ) )
Age calculates fine:
              results         se
                <dbl>      <dbl>
sc_age_years 8.839863 0.04435024
However, I get NA values for ace5.
     results    se
       <dbl> <dbl>
ace5      NA    NA
I also tried converting ace5 to a factor variable and then calculating survey totals:
nsch_design <-
    update(
        nsch_design ,
        ace5f = factor(
            ifelse(ace5 == 1, "Yes", ifelse(ace5 == 2, "No", NA)),
            levels = c("Yes", "No")
        )
    )
MIcombine( with( nsch_design , svytotal( ~ ace5f ) ) )
         results    se
           <dbl> <dbl>
ace5fYes      NA   NaN
ace5fNo       NA   NaN
That syntax also produces NAs. I have tried the syntax with some of the other numeric variables in the dataset (e.g., cavities), and am still producing NAs.
Does anyone have any ideas why I would be getting NA values when trying to compute summary statistics for these variables?
(I am not sure how to generate a minimal reproducible example with a large, complex, multiply imputed dataset, but I am open to suggestions.)
Upvotes: 1
Views: 42
Reputation: 6114
Does the na.rm option shown on the ?svymean help page give you the behavior you're looking for? Thanks!
# returns NA because ace5 contains missing values
MIcombine( with( nsch_design , svymean( ~ ace5 ) ) )
# works but gives wrong answer since it's averaging ones and twos
MIcombine( with( nsch_design , svymean( ~ ace5 , na.rm = TRUE ) ) )
# works
MIcombine( with( nsch_design , svymean( ~ factor( ace5 ) , na.rm = TRUE ) ) )
# works
MIcombine( with( nsch_design , svymean( ~ as.numeric( ace5 == 1 ) , na.rm = TRUE ) ) )
# works
MIcombine( with( subset( nsch_design , !is.na( ace5 ) ) , svymean( ~ as.numeric( ace5 == 1 ) ) ) )
# works but wrong! incorrectly includes missings in the denominator
MIcombine( with( nsch_design , svymean( ~ as.numeric( ace5 %in% 1 ) ) ) )
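To see why the last line is wrong, here is a toy base-R sketch of the denominator issue. It uses made-up values and weights (not NSCH data) and plain weighted.mean() rather than the survey package, so treat it only as an illustration of how == and %in% handle NA differently:

```r
# hypothetical toy data: 1 = yes, 2 = no, NA = missing
ace5 <- c( 1 , 2 , 2 , NA , 1 , NA )
# hypothetical weights, one per record
w <- c( 10 , 20 , 30 , 40 , 50 , 60 )

# `==` propagates NA, so missing records stay missing
ind_eq <- as.numeric( ace5 == 1 )

# `%in%` never returns NA, so missing records silently become zeros
ind_in <- as.numeric( ace5 %in% 1 )

# NA, like svymean() without na.rm = TRUE
mean_default <- weighted.mean( ind_eq , w )

# missing records dropped from both numerator and denominator
mean_na_rm <- weighted.mean( ind_eq , w , na.rm = TRUE )

# missing records counted as "not 1", which inflates the denominator
mean_in <- weighted.mean( ind_in , w )

print( c( default = mean_default , na_rm = mean_na_rm , in_op = mean_in ) )
```

In this toy case the na.rm version divides by the weight of the four non-missing records, while the %in% version divides by the weight of all six, which is why it comes out lower.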
Upvotes: 1