Reputation: 423
Here are two ways to do an independent t-test (welch version) in scipy. Both are giving different results for the calculated pvalue as well as the t-statistic itself? Why is this the case?
import scipy
print(scipy.__version__)
# 1.4.1
from scipy.stats import ttest_ind, ttest_ind_from_stats
x1_a = [19.0924, 19.1055, 19.1192, 19.1431, 19.0970]
x1_b = [20.3323, 20.3472, 20.3417, 20.3408, 20.2849]
x1_c = [19.0448, 18.9576, 19.0171, 19.0184, 18.9534]
ttest_ind(x1_a, x1_c, equal_var=False)
# Ttest_indResult(statistic=5.568858312509857, pvalue=0.0014998806395224108)
ttest_ind_from_stats(np.mean(x1_a), np.std(x1_a), 5, np.mean(x1_c), np.std(x1_c), 5, equal_var=False)
# Ttest_indResult(statistic=6.226172871918404, pvalue=0.000844418100098984)
ttest_ind(x1_a, x1_b, equal_var=False)
# Ttest_indResult(statistic=-83.49461195258749, pvalue=1.3516515130741807e-12)
ttest_ind_from_stats(np.mean(x1_a), np.std(x1_a), 5, np.mean(x1_b), np.std(x1_b), 5, equal_var=False)
# Ttest_indResult(statistic=-93.34981404047603, pvalue=5.764760941006529e-13)
The things I tried to rule out possible causes include checking a possible rounding off issue by feeding it np.sqrt(np.var(x))
instead of np.std(x))
, writing a custom testing function using the wikipedia explanation which gives a result similar to ttest_ind_from_stats
, trying multiple values, manually calculating the sds to avoid an n-1
/n
issue and trying to read the documentation for source code but it seems that ttest_ind
uses _ttest_ind_from_stats
internally which raises my confusion. Here is my custom function:
from scipy.stats import t as tdist
def welch_ttest(m1, m2, s1, s2, n1, n2):
numerator = m1 - m2
denominator = np.sqrt((s1 ** 2) / n1 + (s2 ** 2) / n2)
t = numerator / denominator
dof_numerator = ((s1 ** 2) / n1 + (s2 ** 2) / n2) ** 2
dof_denominator = ((s1 ** 4) / (n1 ** 2) / (n1 - 1) + (s2 ** 4) / (n2 ** 2) / (n2 - 1))
dof = dof_numerator / dof_denominator
p_half = tdist.cdf(t, dof)
if p_half > 0.5:
p_final = 2 * (1 - p_half)
else:
p_final = 2 * p_half
return t, p_final # returning t to check the validity of the function
Upvotes: 2
Views: 653
Reputation: 15738
np.std
doesn't perform Bessel's correction. If it is replaced with pandas' version of std, the results match:
ttest_ind(x1_a, x1_c, equal_var=False)
# Ttest_indResult(statistic=5.568858312509857, pvalue=0.0014998806395224108)
ttest_ind_from_stats(np.mean(x1_a), pd.Series(x1_a).std(), 5, np.mean(x1_c), pd.Series(x1_c).std(), 5, equal_var=False)
# Ttest_indResult(statistic=5.568858312509857, pvalue=0.0014998806395224108)
Or, if you don't want an extra import, just apply the correction in place by multiplying std by sqrt(n/n-1)
Upvotes: 3