Andrew Davidson
Andrew Davidson

Reputation: 69

one sample hypothesis test for proportions

I'm looking for a built-in R function that calculates the power of a one sample hypothesis test for proportions.

The built in function power.prop.test only does TWO SAMPLE hypothesis tests for proportions.

The original question is: "How many times do you have to toss a coin to determine that it is biased?

p.null <- 0.5            # null hypothesis.

We say that a coin is "biased" if the probability of tossing heads is either greater than 0.51 or less than 0.49. Otherwise we say that it is "good enough"

delta <- 0.01       

Here is a function to toss a biased coin N times and return the proportion of heads:

biased.coin <- function(delta, N) {
  probs <- runif(N, 0, 1) 
  heads <- probs[probs < 0.5+delta]      
  return(length(heads)/N)
}

We fix alpha and beta throughout at the standard values. Our goal is to calculate N.

alpha = 0.05        # 95% confidence interval
beta = 0.8          # Correctly reject the null hypothesis 80% of time.

The first step is to use a simulation.

A single experiment is to toss the coin N times and reject the null hypothesis if the number of heads deviates "too far" from the expected value of N/2

We then repeat the experiment M times and count how many times the null hypothesis is (correctly) rejected.

M <- 1000

simulate.power <- function(delta, N, p.null, M, alpha) {
   print(paste("Calculating power for N =", N))
   reject <- c()
   se <- sqrt(p.null*(1-p.null))/sqrt(N)   
   for (i in (1:M)) {
       heads <- biased.coin(delta, N)       # perform an experiment
       z <- (heads - p.null)/se             # z-score 
       p.value <- pnorm(-abs(z))            # p-value
       reject[i] <- p.value < alpha/2       # Do we rejct the null? 
   }
   return(sum(reject)/M)        # proportion of time null was rejected.
  }

Next we plot a graph (slow, about 5 minutes):

ns <- seq(1000, 50000, by=1000)
my.pwr <- c()
for (i in (1:length(ns))) {
  my.pwr[i] <- simulate.power(delta, ns[i], p.null, M, alpha)
}
plot(ns, my.pwr)

From the graph it looks like the N you need for a power of beta = 0.8 is about 20000.

The simulation is very slow so it would be nice to have a built in function.

A little fiddling around gave me this:

magic <- function(p.null, delta, alpha, N) {
  magic <-power.prop.test(p1=p.null, 
                          p2=p.null+delta,
                          sig.level=alpha,

                          ###################################
                          n=2*N,             # mysterious 2
                          ###################################

                          alternative="two.sided",
                          strict=FALSE)
  return(magic[["power"]])                 
}

Let's plot it against our simulated data.

pwr.magic <- c()
for (i in (1:length(ns))) {
   pwr.magic[i] <- magic(p.null, delta, alpha, ns[i]) 
}
points(ns, pwr.magic, pch=20)

The fit is good, but I have no idea why I would need to multiply N by two, in order to get a one sample power out of a two sample proportion test.

It would be nice if there were a built in function that let you do one sample directly.

Thanks!

Upvotes: 4

Views: 540

Answers (1)

Weihuang Wong
Weihuang Wong

Reputation: 13108

You could try

library(pwr)
h <- ES.h(0.51, 0.5) # Compute effect size h for two proportions
pwr.p.test(h = h, n = NULL, sig.level = 0.05, power = 0.8, alternative = "two.sided")
# proportion power calculation for binomial distribution (arcsine transformation) 

#           h = 0.02000133
#           n = 19619.53
#   sig.level = 0.05
#       power = 0.8
# alternative = two.sided

As an aside, one way to speed up your simulation significantly would be to use rbinom instead of runif:

biased.coin2 <- function(delta, N) {
  rbinom(1, N, 0.5 + delta) / N
}

Upvotes: 2

Related Questions