Reputation: 1998
I have a question about the hypergeometric test.
I have data like this:
Population size: 5260
Sample size: 131
Number of items in the population that are classified as successes: 1998
Number of items in the sample that are classified as successes: 62
Is this the correct way to compute a hypergeometric test?
phyper(62, 1998, 5260, 131)
Upvotes: 27
Views: 60356
Reputation: 139
@Albert,
To compute a hypergeometric test, you obtain the same p-value, P(observed 62 or more), using:
> phyper(62-1, 1998, 5260-1998, 131, lower.tail=FALSE)
[1] 0.01697598
Because:
lower.tail: logical; if TRUE (default), probabilities are P[X <= x],
otherwise, P[X > x]
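To see why the first argument is 62-1, here is a minimal sketch that sums the point probabilities from 62 upward and compares the result with the upper-tail call. The variable names are my own, chosen only for illustration.

```r
# Counts from the question; the variable names are illustrative.
m <- 1998          # successes in the population (white balls)
n <- 5260 - 1998   # failures in the population (black balls)
k <- 131           # sample size (balls drawn)
x <- 62            # successes observed in the sample

# P(X >= 62) written out as a sum of point probabilities.
# X cannot exceed the sample size, so the sum stops at k.
p_sum <- sum(dhyper(x:k, m, n, k))

# The same tail via phyper: with lower.tail = FALSE it returns P(X > q),
# so q = x - 1 gives P(X > 61) = P(X >= 62).
p_tail <- phyper(x - 1, m, n, k, lower.tail = FALSE)

all.equal(p_sum, p_tail)
```

The two values should agree up to floating-point error, which is what the all.equal call checks.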
Upvotes: 14
Reputation: 66834
Almost correct. If you look at ?phyper:
phyper(q, m, n, k, lower.tail = TRUE, log.p = FALSE)
x, q vector of quantiles representing the number of white balls drawn
without replacement from an urn which contains both black and white
balls.
m the number of white balls in the urn.
n the number of black balls in the urn.
k the number of balls drawn from the urn.
So using your data:
phyper(62,1998,5260-1998,131)
[1] 0.989247
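The same call with named arguments may make the mapping easier to follow. This is only a sketch; the variable names are hypothetical and not part of phyper.

```r
# Counts from the question; the variable names are illustrative.
pop_size       <- 5260
pop_success    <- 1998
sample_size    <- 131
sample_success <- 62

# phyper expects the number of black balls (failures) as n,
# not the total population size.
pop_failure <- pop_size - pop_success   # 3262

# With the default lower.tail = TRUE this is P(X <= 62),
# the value printed above.
p_lower <- phyper(q = sample_success, m = pop_success,
                  n = pop_failure, k = sample_size)
p_lower
```

Note that this is the lower tail, P(X <= 62), which is not necessarily the p-value you are after.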
Upvotes: 25
Reputation: 1
I think this test should be as follows:
phyper(62,1998,5260-1998,131-62,lower.tail=FALSE)
Then the sum of all the rows will equal the sum of all the columns. This is important when dealing with contingency tables.
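One way to check the contingency-table view is to arrange the four counts from the question as a 2x2 table and run a one-sided Fisher's exact test. The layout below and the choice of alternative = "greater" (testing for over-representation) are my assumptions, so treat this as a sketch.

```r
# 2x2 table built from the question's counts (layout is an assumption):
#                 success   failure
#   in_sample          62        69
#   not_in_sample    1936      3193
tab <- matrix(c(62, 1998 - 62,
                131 - 62, (5260 - 1998) - (131 - 62)),
              nrow = 2,
              dimnames = list(c("in_sample", "not_in_sample"),
                              c("success", "failure")))

# The margins recover the original counts.
rowSums(tab)   # 131 in the sample, 5129 outside it
colSums(tab)   # 1998 successes, 3262 failures

# One-sided Fisher's exact test for over-representation.
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

# Upper-tail hypergeometric probability, P(X >= 62), for comparison.
p_hyper <- phyper(62 - 1, 1998, 5260 - 1998, 131, lower.tail = FALSE)

all.equal(p_fisher, p_hyper)
```

In this layout the first row sums to the sample size, 131, and the first column to the 1998 successes in the population.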
Upvotes: 0
Reputation: 211
I think you want to compute a p-value. In this case, you want
P(Observed 62 or more) = 1-P(Observed less than 62).
So you want
1.0-phyper(62-1, 1998, 5260-1998, 131)
Note the -1 in the first parameter: phyper returns P(X <= x), so passing 62-1 gives P(observed less than 62). You then subtract that from 1.0 to get the area of the right tail.
Correct me if I'm wrong.
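As a quick check, the complement form can be compared with phyper's own lower.tail = FALSE option. This is a small sketch using the numbers from the question.

```r
# Right tail as 1 minus the lower tail: 1 - P(X <= 61).
p_complement <- 1.0 - phyper(62 - 1, 1998, 5260 - 1998, 131)

# Right tail computed directly: P(X > 61).
p_upper <- phyper(62 - 1, 1998, 5260 - 1998, 131, lower.tail = FALSE)

all.equal(p_complement, p_upper)
```

Both forms give the same answer here. For very small tail probabilities, lower.tail = FALSE is the safer choice, because subtracting a number close to 1 from 1.0 loses precision.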
Upvotes: 21