Nicolas Rosewick

Reputation: 1998

Hypergeometric test (phyper)

I have a question about the hypergeometric test.

My data look like this:

Population size: 5260
Sample size: 131
Number of successes in the population: 1998
Number of successes in the sample: 62

Is the following the correct way to compute a hypergeometric test?

phyper(62, 1998, 5260, 131)

Upvotes: 27

Views: 60356

Answers (4)

@Albert,

You obtain the same p-value, P(observed 62 or more), by using the lower.tail argument:

> phyper(62-1, 1998, 5260-1998, 131, lower.tail=FALSE)
[1] 0.01697598

This works because, according to ?phyper:

lower.tail: logical; if TRUE (default), probabilities are P[X <= x], 
            otherwise, P[X > x]
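
As a rough check of the point above, the following sketch computes the upper tail in three ways that should agree: with lower.tail = FALSE, as a complement, and by summing dhyper over 62..131. The variable names (N, K, n, x) are my own labels for the question's numbers, not anything defined by R.

```r
N <- 5260  # population size
K <- 1998  # successes in the population
n <- 131   # sample size
x <- 62    # successes in the sample

# P[X > 61] is the same as P[X >= 62] for an integer-valued X
p_upper <- phyper(x - 1, K, N - K, n, lower.tail = FALSE)

# Complement of the lower tail P[X <= 61]
p_compl <- 1 - phyper(x - 1, K, N - K, n)

# Direct sum of the point probabilities P[X = 62], ..., P[X = 131]
p_sum <- sum(dhyper(x:n, K, N - K, n))

# The answer above reports 0.01697598 for the first call
print(c(upper = p_upper, complement = p_compl, summed = p_sum))
```
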

Upvotes: 14

James

Reputation: 66834

Almost correct. If you look at ?phyper:

phyper(q, m, n, k, lower.tail = TRUE, log.p = FALSE)

x, q vector of quantiles representing the number of white balls drawn
without replacement from an urn which contains both black and white
balls.

m the number of white balls in the urn.

n the number of black balls in the urn.

k the number of balls drawn from the urn.

So using your data:

phyper(62,1998,5260-1998,131)
[1] 0.989247
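
To make the argument mapping concrete, here is a minimal sketch. The helper name lower_tail_from_counts is hypothetical (it is not part of R); it only converts the question's four quantities into phyper's (q, m, n, k). Note that this call returns the lower tail P[X <= 62], not an upper-tail p-value.

```r
# Hypothetical helper, for illustration only: phyper wants the number of
# failures in the population (n), not the population size itself.
lower_tail_from_counts <- function(sample_success, pop_success,
                                   pop_size, sample_size) {
  phyper(q = sample_success,
         m = pop_success,
         n = pop_size - pop_success,
         k = sample_size)
}

fixed    <- lower_tail_from_counts(62, 1998, 5260, 131)
original <- phyper(62, 1998, 5260, 131)  # the question's call: n is too large

print(c(fixed = fixed, original = original))
```

The two values differ because the question's call describes an urn with 1998 + 5260 balls instead of 5260.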

Upvotes: 25

user5531047

Reputation: 1

I think this test should be as follows:

phyper(62,1998,5260-1998,131-62,lower.tail=FALSE)

Then the sum of all the rows will equal the sum of all the columns. This is important when dealing with contingency tables.
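
One way to check how a contingency table lines up with phyper's arguments is to compare against a one-sided fisher.test. This is only a sketch: the table layout (rows = in sample / not in sample, columns = success / failure) is my assumption about how the question's counts would be arranged. In this layout the cell 131-62 appears in the table, while the k passed to phyper is the row total 131.

```r
in_sample  <- c(success = 62,        failure = 131 - 62)
not_sample <- c(success = 1998 - 62, failure = (5260 - 1998) - (131 - 62))

# 2x2 table; all four cells together add up to the population size
tab <- rbind(in_sample, not_sample)
print(tab)
print(sum(tab))

# One-sided Fisher exact test (enrichment of successes in the sample)
p_fisher <- fisher.test(tab, alternative = "greater")$p.value

# Upper tail of the hypergeometric distribution, with k = row total
p_hyper <- phyper(62 - 1, 1998, 5260 - 1998, 131, lower.tail = FALSE)

print(c(fisher = p_fisher, phyper = p_hyper))
```
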

Upvotes: 0

Albert

Reputation: 211

I think you want to compute a p-value. In that case, you want

P(Observed 62 or more) = 1-P(Observed less than 62).

So you want

1.0-phyper(62-1, 1998, 5260-1998, 131)

Note the -1 in the first argument: phyper returns P(X <= q), so P(Observed less than 62) is the value at q = 61. You also need to subtract that from 1.0 to get the area of the right tail.
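
A small caveat, sketched below: subtracting from 1.0 is fine for the numbers in the question, but when the right tail is extremely small the subtraction can lose precision, whereas lower.tail = FALSE computes the tail directly. The extreme case used here (all 131 draws being successes) is made up purely for illustration.

```r
m <- 1998          # successes in the population
n_fail <- 5260 - m # failures in the population
k <- 131           # sample size

# For the question's data both forms agree
p1 <- 1 - phyper(62 - 1, m, n_fail, k)
p2 <- phyper(62 - 1, m, n_fail, k, lower.tail = FALSE)

# Illustrative extreme case: P[X >= 131]
p_direct <- phyper(130, m, n_fail, k, lower.tail = FALSE)
p_compl  <- 1 - phyper(130, m, n_fail, k)

print(c(p1 = p1, p2 = p2))
print(c(direct = p_direct, complement = p_compl))
```
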

Correct me if I'm wrong.

Upvotes: 21
