Reputation: 69
I'm looking for a built-in R function that calculates the power of a one sample hypothesis test for proportions.
The built in function power.prop.test only does TWO SAMPLE hypothesis tests for proportions.
The original question is: "How many times do you have to toss a coin to determine that it is biased?
p.null <- 0.5 # null hypothesis.
We say that a coin is "biased" if the probability of tossing heads is either greater than 0.51 or less than 0.49. Otherwise we say that it is "good enough"
delta <- 0.01
Here is a function to toss a biased coin N times and return the proportion of heads:
biased.coin <- function(delta, N) {
probs <- runif(N, 0, 1)
heads <- probs[probs < 0.5+delta]
return(length(heads)/N)
}
We fix alpha and beta throughout at the standard values. Our goal is to calculate N.
alpha = 0.05 # 95% confidence interval
beta = 0.8 # Correctly reject the null hypothesis 80% of time.
The first step is to use a simulation.
A single experiment is to toss the coin N times and reject the null hypothesis if the number of heads deviates "too far" from the expected value of N/2
We then repeat the experiment M times and count how many times the null hypothesis is (correctly) rejected.
M <- 1000
simulate.power <- function(delta, N, p.null, M, alpha) {
print(paste("Calculating power for N =", N))
reject <- c()
se <- sqrt(p.null*(1-p.null))/sqrt(N)
for (i in (1:M)) {
heads <- biased.coin(delta, N) # perform an experiment
z <- (heads - p.null)/se # z-score
p.value <- pnorm(-abs(z)) # p-value
reject[i] <- p.value < alpha/2 # Do we rejct the null?
}
return(sum(reject)/M) # proportion of time null was rejected.
}
Next we plot a graph (slow, about 5 minutes):
ns <- seq(1000, 50000, by=1000)
my.pwr <- c()
for (i in (1:length(ns))) {
my.pwr[i] <- simulate.power(delta, ns[i], p.null, M, alpha)
}
plot(ns, my.pwr)
From the graph it looks like the N you need for a power of beta = 0.8 is about 20000.
The simulation is very slow so it would be nice to have a built in function.
A little fiddling around gave me this:
magic <- function(p.null, delta, alpha, N) {
magic <-power.prop.test(p1=p.null,
p2=p.null+delta,
sig.level=alpha,
###################################
n=2*N, # mysterious 2
###################################
alternative="two.sided",
strict=FALSE)
return(magic[["power"]])
}
Let's plot it against our simulated data.
pwr.magic <- c()
for (i in (1:length(ns))) {
pwr.magic[i] <- magic(p.null, delta, alpha, ns[i])
}
points(ns, pwr.magic, pch=20)
The fit is good, but I have no idea why I would need to multiply N by two, in order to get a one sample power out of a two sample proportion test.
It would be nice if there were a built in function that let you do one sample directly.
Thanks!
Upvotes: 4
Views: 540
Reputation: 13108
You could try
library(pwr)
h <- ES.h(0.51, 0.5) # Compute effect size h for two proportions
pwr.p.test(h = h, n = NULL, sig.level = 0.05, power = 0.8, alternative = "two.sided")
# proportion power calculation for binomial distribution (arcsine transformation)
# h = 0.02000133
# n = 19619.53
# sig.level = 0.05
# power = 0.8
# alternative = two.sided
As an aside, one way to speed up your simulation significantly would be to use rbinom
instead of runif
:
biased.coin2 <- function(delta, N) {
rbinom(1, N, 0.5 + delta) / N
}
Upvotes: 2