Using ttest_ind on specific columns of pandas dataframe

Question

I have a pandas dataframe in the following format:

Variable  A1  A2  B1  B2  C1  C2  D1  D2
X          2   3   5   6  13  12   3   3
Y          1   1   7   9  16  19  11   9
Z          3   4   6   6   2   3  53  48

Here A1-A2, B1-B2, etc. are replicate measurements, and X, Y, Z are the different variables being measured.

I would like to run a t-test between D1-D2 and B1-B2 for each row and then append a new column with the p-value from each comparison.

The desired result would be:

Variable  A1  A2  B1  B2  C1  C2  D1  D2  p-val
X          2   3   5   6  13  12   3   3  0.0345
Y          1   1   7   9  16  19  11   9  0.111
Z          3   4   6   6   2   3  53  48  0.0004

Thank you in advance.

sacuL · Accepted Answer

I got different results (I can't tell how you ran your t-test), but you can use scipy.stats.ttest_ind to run a t-test on independent samples and extract the p-values from the result (the element at index 1 of the output; index 0 holds the t-statistics; see the linked doc for details):

from scipy.stats import ttest_ind

df['p-val'] = ttest_ind(df[['B1', 'B2']], df[['D1', 'D2']], axis=1)[1]

>>> df
  Variable  A1  A2  B1  B2  C1  C2  D1  D2     p-val
0        X   2   3   5   6  13  12   3   3  0.037750
1        Y   1   1   7   9  16  19  11   9  0.292893
2        Z   3   4   6   6   2   3  53  48  0.003141
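
For anyone who wants to reproduce this from scratch, here is a minimal self-contained sketch. It rebuilds the table from the question by hand, converts the two replicate groups to NumPy arrays (the names b and d are just illustrative), and reads the p-values from the result's pvalue attribute instead of indexing with [1]. It assumes the default Student's t-test (equal_var=True), which is what the one-liner above uses.

```python
import pandas as pd
from scipy.stats import ttest_ind

# Rebuild the example table from the question.
df = pd.DataFrame({
    "Variable": ["X", "Y", "Z"],
    "A1": [2, 1, 3], "A2": [3, 1, 4],
    "B1": [5, 7, 6], "B2": [6, 9, 6],
    "C1": [13, 16, 2], "C2": [12, 19, 3],
    "D1": [3, 11, 53], "D2": [3, 9, 48],
})

# Pull each replicate group out as a 2-D float array (one row per variable).
b = df[["B1", "B2"]].to_numpy(dtype=float)
d = df[["D1", "D2"]].to_numpy(dtype=float)

# axis=1 runs one independent two-sample test per row.
# equal_var=True (the default) is the classic Student's t-test.
result = ttest_ind(b, d, axis=1)

# The result exposes the statistics and p-values by name.
df["p-val"] = result.pvalue

print(df)
```

If the two groups can't be assumed to have equal variance, passing equal_var=False to ttest_ind switches to Welch's t-test, which may be one reason results differ between tools. With only two replicates per group, any of these p-values should be read with caution.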
