SeanB

Reputation: 25

R: dpois not returning correct probabilities?

I am working with a dataset of the number of truffles found in 288 search areas. I am planning to test the null hypothesis that the truffles are distributed randomly, so I am using dpois() to calculate the expected probabilities. There are 4 categories (0, 1, 2, or 3 truffles per plot). The expected probabilities will later be converted to expected proportions and incorporated into a chisq.test analysis.

The problem is that the expected probabilities I get with the following code don't make sense. They should sum to roughly 1, but they are much too small. When I run the exact same code on another dataset, it produces sensible values. What is going on here?

trufflesFound<-c(rep(0,203),rep(1,39),rep(2,18),rep(3,28))
trufflesTable<-table(trufflesFound)
trufflesTable

mean(trufflesTable)

expTruffPois<-dpois(x = 0:3, lambda = mean(trufflesTable)) 
expTruffPois

These are the probabilities it gives me, which are much too low!

0: 0.00000000000000000000000000000005380186

1: 0.00000000000000000000000000000387373404

2: 0.00000000000000000000000000013945442527

3: 0.00000000000000000000000000334690620643

In contrast, this dataset works just fine:

extinctData<-c(rep(1,13),rep(2,15),rep(3,16),rep(4,7),rep(5,10),rep(6,4),7,7,8,9,9,10,11,14,16,16,20)
extinctFreqTable <- table(extinctData)
extinctFreqTable

mean(extinctFreqTable)

expPois <- dpois(x = 0:20, lambda = mean(extinctFreqTable))
expPois

sum(expPois)

The sum is 0.9999997, which is close to the expected value of 1.

Thoughts?

Upvotes: 1

Views: 198

Answers (1)

smingerson

Reputation: 1438

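Lambda should be the mean number of truffles per search area, but mean(trufflesTable) returns the mean of the four category counts, (203 + 39 + 18 + 28) / 4 = 72. With lambda = 72, the probability of seeing only 0 to 3 truffles is vanishingly small, which is exactly what you got. Use mean(trufflesFound) instead, which is about 0.55. The second example only looks "right" because mean(extinctData) (about 4.2) happens to be relatively close to mean(extinctFreqTable) (about 5.4). Note that the probabilities still won't sum exactly to 1, because given the mean it is conceivable that we'd observe more than 3 truffles in a future search area.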
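To make the difference between the two means concrete, here is a small sketch using the question's data. The variable names meanOfCounts, meanPerPlot and weightedMean are just illustrative; the last one shows how the per-plot mean could be recovered from the frequency table if the raw vector were not available.

```r
trufflesFound <- c(rep(0, 203), rep(1, 39), rep(2, 18), rep(3, 28))
trufflesTable <- table(trufflesFound)

# Mean of the four category counts: (203 + 39 + 18 + 28) / 4
meanOfCounts <- mean(trufflesTable)

# Mean number of truffles per search area:
# (0*203 + 1*39 + 2*18 + 3*28) / 288
meanPerPlot <- mean(trufflesFound)

# The same per-plot mean, recovered from the table as a weighted mean
truffleValues <- as.numeric(names(trufflesTable))
weightedMean <- sum(truffleValues * as.vector(trufflesTable)) / sum(trufflesTable)

meanOfCounts
meanPerPlot
weightedMean
```

Only the second and third quantities estimate the Poisson rate; the first is a property of how many categories the table happens to have.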

trufflesFound<-c(rep(0,203),rep(1,39),rep(2,18),rep(3,28))
expTruffPois<-dpois(x = 0:3, lambda = mean(trufflesFound)) 
expTruffPois
#> [1] 0.57574908 0.31786147 0.08774301 0.01614715
sum(expTruffPois)
#> [1] 0.9975007

Created on 2022-02-08 by the reprex package (v2.0.1)
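For the chisq.test step mentioned in the question, the expected probabilities need to sum to 1. One possible way to get there, sketched below, is to treat the last category as "3 or more" and take its probability from the upper tail with ppois(). That lumping is an assumption on my part, not something stated in the question, and the names expProbs, gof and pAdjusted are only illustrative.

```r
trufflesFound <- c(rep(0, 203), rep(1, 39), rep(2, 18), rep(3, 28))
observed <- as.vector(table(trufflesFound))
lambdaHat <- mean(trufflesFound)

# P(0), P(1), P(2) from the Poisson pmf; the last category is taken as
# P(X >= 3) (an assumption) so that the probabilities sum to 1
expProbs <- c(dpois(0:2, lambda = lambdaHat),
              ppois(2, lambda = lambdaHat, lower.tail = FALSE))
sum(expProbs)

# Goodness-of-fit test against the Poisson expectations
gof <- chisq.test(x = observed, p = expProbs)
gof

# chisq.test() uses df = k - 1; because lambda was estimated from the
# same data, a common adjustment is df = k - 2, computed by hand here
pAdjusted <- pchisq(unname(gof$statistic),
                    df = length(observed) - 2,
                    lower.tail = FALSE)
pAdjusted
```

If you would rather keep the category as exactly 3, chisq.test() also has a rescale.p = TRUE argument that rescales the supplied probabilities to sum to 1, though that changes what is being tested.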

Upvotes: 1
