Reputation: 3720
I want to estimate the sample size needed for a chi-squared test of homogeneity on discrete data using Python, and I need a hint on how to do it.
Specifically, I want to determine whether the failure rates of two production processes differ significantly (alpha = 5%).
So far I have only found the statsmodels.stats.gof.chisquare_effectsize() function, but it seems to work only for a goodness-of-fit test.
Is there a way to determine the required sample size?
I appreciate every answer.
Upvotes: 3
Views: 2713
Reputation: 592
You can use statsmodels.stats.power.GofChisquarePower().solve_power(). However, you need to adjust the degrees of freedom (df) to account for the number of levels of each variable. You can do this through the n_bins parameter.
>>> import statsmodels.stats.power as smp
>>> n_levels_variable_a = 2
>>> n_levels_variable_b = 3
>>> smp.GofChisquarePower().solve_power(0.346, power=.8, n_bins=(n_levels_variable_a-1)*(n_levels_variable_b-1), alpha=0.05)
115.94688728433769
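For the concrete case in the question (two processes, each unit either fails or passes), the table is 2x2 and df = 1. In that case the chi-squared test is equivalent to a two-sided z-test, so the calculation can be sketched with the standard library alone. This is a rough sketch, not statsmodels code: the function names are made up for illustration, the 5% and 10% failure rates are hypothetical, and both processes are assumed to contribute the same number of observations.

```python
from math import sqrt
from statistics import NormalDist


def cohens_w(p_fail_a, p_fail_b):
    """Cohen's w for a 2x2 homogeneity table (assumes equal group sizes)."""
    pooled = (p_fail_a + p_fail_b) / 2  # common failure rate under H0
    # Cell probabilities: each process contributes half of all observations.
    alt = [p_fail_a / 2, (1 - p_fail_a) / 2, p_fail_b / 2, (1 - p_fail_b) / 2]
    null = [pooled / 2, (1 - pooled) / 2, pooled / 2, (1 - pooled) / 2]
    return sqrt(sum((a - n) ** 2 / n for a, n in zip(alt, null)))


def power_df1(w, nobs, alpha=0.05):
    """Power of a chi-squared test with df = 1 and noncentrality nobs * w**2."""
    z = NormalDist()
    crit = z.inv_cdf(1 - alpha / 2)  # square root of the chi2(1) critical value
    shift = w * sqrt(nobs)           # square root of the noncentrality parameter
    return (1 - z.cdf(crit - shift)) + z.cdf(-crit - shift)


def required_nobs(w, power=0.8, alpha=0.05):
    """Smallest total sample size whose power reaches the target."""
    nobs = 2
    while power_df1(w, nobs, alpha) < power:  # power grows with nobs
        nobs += 1
    return nobs


# Hypothetical failure rates: 5% for process A, 10% for process B.
w = cohens_w(0.05, 0.10)
n_total = required_nobs(w)
print(f"w = {w:.4f}, total n = {n_total}")
```

The result is the total number of observations across both processes, so each process needs half of it. One thing worth checking against the statsmodels documentation: as far as I can tell, GofChisquarePower takes its degrees of freedom as n_bins - 1, so the comparable call for a 2x2 table would pass n_bins=2.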
Upvotes: 4