Reputation: 77
Suppose I have a single group with a categorical variable whose levels occur with different frequencies, and the data are also clustered, with clusters of unequal size. I want to know whether the level frequencies differ significantly from one another, while somehow adjusting for the clustering. How do I do this? The following small example is just for illustration (my real data set is much larger):
var <- c("a", "a", "a", "b", "b", "c", "c", "c", "c", "d", "d", "e", "e", "e", "e", "e")
clusters <- c(1, 1, 2, 2, 3, 3, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)
I presume this calls for some sort of goodness-of-fit test, perhaps R's aods3::gof or aodml, but I'm not sure about
(a) which R function to use and
(b) the model specification.
All the examples I have found, e.g. for gof, compare two groups and use a two-sided formula with the group on one side, whereas I presume that my formula would use ~ 1? Without the clustering I could, as an approximation, simply create a variable var2 with the same levels but near-equal frequencies and then run a chi-squared test. With such an approximation, though, I don't know what to do with the clusters.
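To make the unclustered approximation concrete, here is a minimal sketch in base R using the toy data above. It assumes the null hypothesis is that all levels are equally likely, which is what chisq.test uses by default when given a one-way table. The names counts, cluster_sizes and naive are only illustrative. This test treats all 16 observations as independent, so it is the baseline that a cluster-adjusted test would need to correct.

```r
var <- c("a", "a", "a", "b", "b", "c", "c", "c", "c", "d", "d",
         "e", "e", "e", "e", "e")
clusters <- c(1, 1, 2, 2, 3, 3, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)

# Observed count of each level of the categorical variable
counts <- table(var)

# Number of observations in each cluster, to show how unequal they are
cluster_sizes <- table(clusters)

# Naive goodness-of-fit test against equal level probabilities.
# It ignores the clustering entirely. The expected counts are small in
# this toy example, so R's "approximation may be incorrect" warning is
# suppressed here.
naive <- suppressWarnings(chisq.test(counts))

print(counts)
print(cluster_sizes)
print(naive)
```

Passing a one-way table of counts directly avoids having to build an artificial var2 with near-equal frequencies.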
Upvotes: 1
Views: 46
Reputation: 77
OK, so I found a way to install the R package htestClust on macOS Catalina: install gfortran-10.2-Catalina; then in R run install.packages("githubinstall"); githubinstall("bootstrap"); install.packages("htestClust"). Then use chisqtestClust with var and clusters.
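For reference, the call might look roughly like the sketch below. The argument names x and id are my assumption about the chisqtestClust interface, so check ?chisqtestClust before relying on them. The requireNamespace guard and the tryCatch wrapper are only there so the snippet does not stop if the package is missing or if the toy data turn out to be too small for the test.

```r
var <- c("a", "a", "a", "b", "b", "c", "c", "c", "c", "d", "d",
         "e", "e", "e", "e", "e")
clusters <- c(1, 1, 2, 2, 3, 3, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12)

res <- NULL
if (requireNamespace("htestClust", quietly = TRUE)) {
  # Clustered goodness-of-fit test on the levels of var, with `clusters`
  # identifying the cluster of each observation.
  # ASSUMPTION: the arguments are named `x` and `id`; see ?chisqtestClust.
  res <- tryCatch(
    htestClust::chisqtestClust(x = factor(var), id = clusters),
    error = function(e) {
      message("chisqtestClust failed: ", conditionMessage(e))
      NULL
    }
  )
}

# res stays NULL if the package is not installed or the call failed
if (!is.null(res)) print(res)
```

With only 12 clusters the result on this toy example should not be taken seriously; the point is just the shape of the call.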
Upvotes: 0