Proportion vs contingency chi-square tests giving different p-values in Python

Question

I have found two different ways to run a chi-square test for an A/B test that compares conversions against users for a control group and a test group.

The first method uses statsmodels and its proportions_chisquare function.

The second method uses scipy and its chi2_contingency function.

It seems that chi2_contingency always gives a higher p-value than proportions_chisquare. Any idea what causes the difference, and which test is more applicable for a simple A/B test?

I apologize for not including an example originally; here are two:

Example 1 (p-value = 0.037):

import statsmodels.stats.proportion as proportion
import numpy as np

conv_a = 20
conv_b = 35
clicks_a = 500
clicks_b = 500
converted = np.array([conv_a, conv_b])
clicks = np.array([clicks_a, clicks_b])

chisq, pvalue, table = proportion.proportions_chisquare(converted, clicks)
print('Results are ','chisq =%.3f, pvalue = %.3f'%(chisq, pvalue))

Example 2 (p-value = 0.0521):

import numpy
import scipy.stats

control_size = 500
A_CONVERSIONS = 20
A_NO_CONVERSIONS = control_size - A_CONVERSIONS
test_size = 500
B_CONVERSIONS = 35
B_NO_CONVERSIONS = test_size - B_CONVERSIONS

data = numpy.array([[A_NO_CONVERSIONS, A_CONVERSIONS],
                    [B_NO_CONVERSIONS, B_CONVERSIONS]])

chi_square, p_value = scipy.stats.chi2_contingency(data)[:2]

print('χ²: %.4f' % chi_square)
print('p-value: %.4f' % p_value)
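
One likely source of the gap, offered as a sketch rather than a definitive diagnosis: for a 2x2 table (one degree of freedom), scipy's chi2_contingency applies Yates' continuity correction by default through its correction parameter, and the 0.037 from Example 1 matches an uncorrected Pearson chi-square. The check below uses only numpy and scipy on the same counts as the two examples; the variable names (table, p_yates, p_plain) are illustrative and not from the original code.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Same counts as in both examples: 20/500 and 35/500 conversions.
converted = np.array([20, 35])
clicks = np.array([500, 500])

# 2x2 table with columns [no conversion, conversion], as in Example 2.
table = np.column_stack([clicks - converted, converted])

# Default call: Yates' continuity correction is applied when dof == 1.
chi2_yates, p_yates = chi2_contingency(table)[:2]

# Same test with the correction switched off (plain Pearson chi-square).
chi2_plain, p_plain = chi2_contingency(table, correction=False)[:2]

print('with correction:    chisq = %.3f, pvalue = %.4f' % (chi2_yates, p_yates))
print('without correction: chisq = %.3f, pvalue = %.4f' % (chi2_plain, p_plain))
```

With the correction disabled, the p-value should line up with the figure reported for Example 1, and with it enabled, with the figure for Example 2. Because the correction shrinks the statistic, the corrected p-value is never the smaller of the two, which would explain why chi2_contingency always looked higher.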

