André Rabie

Reputation: 11

Is there an R function for a multiple group goodness of fit chi squared test?

S.giganteus <- matrix(c(0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 1, 6, 0, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 1, 7, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 6, 6, 0, 0, 0, 0, 0, 0, 2, 3, 0, 0, 0, 0, 0, 0, 1, 4, 0, 0, 0, 0, 0, 0, 4, 2, 0, 0, 0, 0, 0, 1, 4, 1, 0, 0), ncol = 7, byrow = T)
colnames(S.giganteus) <- c("s1", "s2", "s3", "s4", "s5", "s6", "s7")
rownames(S.giganteus) <- c("feb", "mar", "apr", "may", "jun", "jul", "aug", "sep", "oct", "nov", "dec", "jan")
S.giganteus <- as.table(S.giganteus)

P.melanotus <- matrix(c(0, 0, 0, 0, 4, 1, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 4, 7, 6, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 7, 2, 0, 0, 0, 0, 0, 1, 2, 1, 0, 0, 0, 0, 4, 1, 1, 0, 0, 0, 0, 0, 3, 5, 0, 0, 0, 0, 0, 3, 8, 1, 0, 0, 0, 0, 0, 3, 1, 0, 0, 0), ncol = 7, byrow = TRUE)
colnames(P.melanotus) <- c("s1", "s2", "s3", "s4", "s5", "s6", "s7")
rownames(P.melanotus) <- c("feb", "mar", "apr", "may", "jun", "jul", "aug", "sep", "oct", "nov", "dec", "jan")
P.melanotus <- as.table(P.melanotus)

I do not know if this is even possible, but I would like to do a multinomial goodness-of-fit chi-squared test. For example, I would like to compare the proportions of S.giganteus individuals at each stage in a given month with the observed counts of P.melanotus individuals in that month. I can't use a standard chi-squared test because sampling effort differed, so the group sizes are unequal. I would also like to avoid testing each month individually, to limit my chances of a type I error. I have no clue where to start, or even whether this is possible, because I can't find anything online. I'm open to other options too.

Upvotes: 1

Views: 260

Answers (1)

Kat

Reputation: 18714

I'm not sure what your objective is here, but this is what I found. I'm also not sure what you mean by uneven group sizes, because both tables are 12 x 7.

S.giganteus <- matrix(c(0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 1, 6, 0, 0, 0, 
                        0, 0, 2, 2, 0, 0, 0, 0, 0, 1, 7, 0, 0, 0, 0, 0,
                        0, 6, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0, 0, 0, 6, 
                        6, 0, 0, 0, 0, 0, 0, 2, 3, 0, 0, 0, 0, 0, 0, 1, 
                        4, 0, 0, 0, 0, 0, 0, 4, 2, 0, 0, 0, 0, 0, 1, 4, 
                         1, 0, 0), ncol = 7, byrow = T)
colnames(S.giganteus) <- c("s1", "s2", "s3", "s4", "s5", "s6", "s7")
rownames(S.giganteus) <- c("feb", "mar", "apr", "may", "jun", "jul", 
                           "aug", "sep", "oct", "nov", "dec", "jan")
(S.giganteus <- as.table(S.giganteus))

P.melanotus <- matrix(c(0, 0, 0, 0, 4, 1, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 
                        0, 0, 0, 4, 7, 6, 0, 0, 0, 0, 0, 0, 8, 0, 0, 0, 
                        0, 0, 0, 5, 0, 0, 0, 0, 0, 0, 7, 2, 0, 0, 0, 0, 
                        0, 1, 2, 1, 0, 0, 0, 0, 4, 1, 1, 0, 0, 0, 0, 0, 
                        3, 5, 0, 0, 0, 0, 0, 3, 8, 1, 0, 0, 0, 0, 0, 3, 
                        1, 0, 0, 0), ncol = 7, byrow = T)
colnames(P.melanotus) <- c("s1", "s2", "s3", "s4", "s5", "s6", "s7")
rownames(P.melanotus) <- c("feb", "mar", "apr", "may", "jun", "jul", 
                           "aug", "sep", "oct", "nov", "dec", "jan")
(P.melanotus <- as.table(P.melanotus))
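Matching dimensions do not rule out unequal sampling effort: the number of individuals sampled per month (the row totals) can still differ between the two tables. A minimal sketch of that check, using a small made-up table (`toy` and its counts are hypothetical, not the data above):

```r
# Hypothetical counts: 3 months x 3 stages (not the real data)
toy <- matrix(c(4, 1, 0,
                0, 6, 2,
                0, 0, 3),
              ncol = 3, byrow = TRUE,
              dimnames = list(c("feb", "mar", "apr"), c("s1", "s2", "s3")))
toy <- as.table(toy)

# Individuals sampled per month (the "group sizes")
month_totals <- rowSums(toy)
print(month_totals)

# Stage proportions within each month; each row sums to 1
stage_props <- prop.table(toy, margin = 1)
print(round(stage_props, 2))
```

Running `rowSums()` on `S.giganteus` and on `P.melanotus` in the same way would show whether the monthly sample sizes differ between the species.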

Then I applied a chi-squared test:

chisq.test(S.giganteus, P.melanotus)
# 
#   Pearson's Chi-squared test
# 
# data:  S.giganteus
# X-squared = 212.12, df = 66, p-value < 2.2e-16
#  

Because of the warning

Warning message: In chisq.test(S.giganteus, P.melanotus) : Chi-squared approximation may be incorrect

I used Monte Carlo simulation:

chisq.test(S.giganteus, P.melanotus, simulate.p.value = TRUE)
# 
#   Pearson's Chi-squared test with simulated p-value (based on 2000
#   replicates)
# 
# data:  S.giganteus
# X-squared = 212.12, df = NA, p-value = 0.0004998
# 
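One caveat: `chisq.test()` documents its `y` argument as ignored when `x` is a matrix, which fits the `data:  S.giganteus` line in the output. So the calls above test stage against month within S.giganteus only, not one species against the other. One possible way to compare the species month by month while allowing for different sample sizes is to build a species-by-stage table for each month, test it with a simulated p-value, and then adjust the p-values across months to address the type I error concern. This is only a sketch under assumptions: `sp_a`, `sp_b`, their counts, and the helper `compare_month()` are hypothetical and stand in for the real tables.

```r
set.seed(1)  # simulated p-values are random; fix the seed for reproducibility

# Hypothetical counts (months x stages); replace with the real tables
sp_a <- matrix(c(5, 1, 0,
                 0, 6, 1,
                 0, 2, 7),
               ncol = 3, byrow = TRUE,
               dimnames = list(c("feb", "mar", "apr"), c("s1", "s2", "s3")))
sp_b <- matrix(c(1, 8, 0,
                 0, 3, 9,
                 0, 1, 4),
               ncol = 3, byrow = TRUE,
               dimnames = list(c("feb", "mar", "apr"), c("s1", "s2", "s3")))

# Hypothetical helper: test whether stage proportions differ between
# the two species within one month (2 x k table, species by stage)
compare_month <- function(month) {
  tab <- rbind(a = sp_a[month, ], b = sp_b[month, ])
  tab <- tab[, colSums(tab) > 0, drop = FALSE]  # drop stages absent in both
  # Not testable if fewer than two stages remain or one species is unsampled
  if (ncol(tab) < 2 || any(rowSums(tab) == 0)) return(NA_real_)
  chisq.test(tab, simulate.p.value = TRUE, B = 2000)$p.value
}

p_raw <- sapply(rownames(sp_a), compare_month)

# Holm adjustment controls the family-wise type I error across months
p_adj <- p.adjust(p_raw, method = "holm")
print(cbind(p_raw, p_adj))
```

Because each monthly table is tested on its own margins, the two species do not need equal sample sizes. With counts this sparse, `fisher.test()` on each monthly table would be a reasonable alternative to the simulated chi-squared p-value.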

Upvotes: 0
