Reputation: 103
I'm trying to calculate confidence intervals of proportions using the R survey package function svyciprop with the "likelihood" method.
Here's some sample code:
library(survey)
df <- data.frame(id = c(1, 1, 1, 2, 2, 2), var = c("a", "b", "a", "b", "a", "b"))
survey_design <- svydesign(id = ~id, data = df)
svyciprop(~I(var == "a"), survey_design, method = "likelihood")
This generates the error message:
Error in seq.int(xmin, xmax, length.out = n) : 'from' must be finite
I can find nothing in the package documentation that explains how to get this to work.
Many thanks!
Upvotes: 2
Views: 1078
Reputation: 251
The error is fixed in version 3.31-4. However, the interval is still reported as
> svyciprop(~I(var == "a"), survey_design, method = "likelihood",level=0.95)
                   2.5% 97.5%
I(var == "a") 0.5   0.0    NA
As Anthony indicates, the problem is that the true confidence interval goes from nearly 0 to nearly 1.
Upvotes: 2
Reputation: 61
The problem in this example is that the denominator degrees of freedom are too small. The code in the survey package ends up calling MASS::confint.glm to get the interval, but the interval half-width is about 10 (compared to, say, 1.96 for iid sampling and large samples). That means a nominal one-sided tail probability of 1.6e-25 has to be given to MASS::confint.glm. Unfortunately, MASS::confint.glm wants this in the form 1-1.6e-25, which is 1 up to machine precision.
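As a rough illustration of the two points above, the following base R sketch compares t critical values for small degrees of freedom with the normal value, and then checks what happens to 1 - p in double precision. It is only a sketch: it ignores any design-effect scaling the survey package applies, so the tail probabilities it prints are not expected to reproduce the 1.6e-25 figure exactly. The variable names are made up for this example.

```r
# Base R only; no survey or MASS functions are called here.
dfs <- 1:5
t_crit <- qt(0.975, df = dfs)   # two-sided 95% critical values for small df
z_crit <- qnorm(0.975)          # large-sample value, about 1.96
print(round(rbind(df = dfs, t_crit = t_crit), 2))

# Normal tail probability implied by each t critical value
# (assumption: no design-effect scaling, unlike the real calculation)
tail_p <- pnorm(-t_crit)
print(signif(tail_p, 3))

# The tail probability quoted above, expressed as 1 - p
p <- 1.6e-25
one_minus_p <- 1 - p
print(one_minus_p == 1)         # p is far below .Machine$double.eps
print(.Machine$double.eps)
```

Because 1 - p is indistinguishable from 1 in double precision, a routine that takes the level in that form cannot tell the request apart from a 100% interval.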
You can specify the denominator degrees of freedom using the df argument: with these data you get a result down to df=2.
> svyciprop(~I(var == "a"), survey_design, method = "like",df=5)
                      2.5% 97.5%
I(var == "a") 0.500  0.117  0.88
> svyciprop(~I(var == "a"), survey_design, method = "like",df=4)
                        2.5% 97.5%
I(var == "a") 0.5000  0.0993   0.9
> svyciprop(~I(var == "a"), survey_design, method = "like",df=3)
                        2.5% 97.5%
I(var == "a") 0.5000  0.0696  0.93
> svyciprop(~I(var == "a"), survey_design, method = "like",df=2)
                        2.5% 97.5%
I(var == "a") 0.5000  0.0216  0.98
> svyciprop(~I(var == "a"), survey_design, method = "like",df=1)
Error in seq.int(xmin, xmax, length.out = n) : 'from' must be finite
It's pretty clear that the confidence interval will stretch from nearly 0 to nearly 1.
Upvotes: 3
Reputation: 6124
the documentation for svyciprop is found by typing ?svyciprop or by googling svyciprop but the documentation won't cover something as specific as your error.
since all R code is available for users to read, you can debug the function that you are using. debug survey:::svyciprop which leads you to survey:::confint.svyglm which leads you to MASS:::confint.glm which leads you to MASS:::confint.profile.glm and so on. the internet has lots of explanations of how to use the debug function in R. there are a lot of moving parts here.
you're getting some Inf values from the glm objects deep in this calculation that's causing it to break. it's probably related to your example data set being a bit too perfect (and unrealistic). ;)
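to see how a single Inf can produce this kind of message, here is a minimal base R sketch. it does not call the survey or MASS code at all; it only shows that seq.int() refuses a non-finite endpoint, which is the call named in the error. the exact wording of the message can differ between R versions, and the object names are made up for the example.

```r
# base R only: a non-finite 'from' makes seq.int() fail
res <- tryCatch(
  seq.int(-Inf, 1, length.out = 5),
  error = function(e) conditionMessage(e)  # keep the message instead of stopping
)
print(res)

# with a finite endpoint the same call works
ok <- seq.int(-5, 1, length.out = 5)
print(ok)
```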
if i throw out a single observation from your df then it works.
library(survey)
df <- data.frame(id = c(1, 1, 2, 2, 2), var = c("a", "b", "b", "a", "b"))
survey_design <- svydesign(id = ~id, data = df)
svyciprop(~I(var == "a"), survey_design, method = "likelihood")
the example data sets found at the bottom of ?svyciprop also work.
your particular issue is that the confidence interval you're requesting is impossible.
# watch how the confidence interval tends toward zero and one as you widen it.
svyciprop(~I(var == "a"), survey_design, method = "likelihood",level=0.8)
svyciprop(~I(var == "a"), survey_design, method = "likelihood",level=0.9)
svyciprop(~I(var == "a"), survey_design, method = "likelihood",level=0.93)
svyciprop(~I(var == "a"), survey_design, method = "likelihood",level=0.95) # this is the default
Upvotes: 1