How to perform a t-test on data stored in a pandas DataFrame

Question

I have some experimental data. The experiment measured 126 genes over time in three different cell lines, with n=6 replicates each. The normalized measurement is known as the delta_ct value. The data is stored in a pandas.DataFrame that looks like this:

                       Gene     Group  Time  Repeat  delta_ct
Group    Time Repeat                                         
Adult    0    1       SMAD3     Adult     0       1  0.115350
              2       SMAD3     Adult     0       2  0.076046
              3       SMAD3     Adult     0       3  0.081212
              4       SMAD3     Adult     0       4  0.083205
              5       SMAD3     Adult     0       5  0.101456
              6       SMAD3     Adult     0       6  0.089714
         1    1       SMAD3     Adult     1       1  0.088079
              2       SMAD3     Adult     1       2  0.093965
              3       SMAD3     Adult     1       3  0.114951
              4       SMAD3     Adult     1       4  0.082359
              5       SMAD3     Adult     1       5  0.080788
              6       SMAD3     Adult     1       6  0.103181
Neonatal 24   1       SMAD3  Neonatal    24       1  0.039883
              2       SMAD3  Neonatal    24       2  0.037161
              3       SMAD3  Neonatal    24       3  0.042874
              4       SMAD3  Neonatal    24       4  0.047950
              5       SMAD3  Neonatal    24       5  0.053673
              6       SMAD3  Neonatal    24       6  0.040181
         30   1       SMAD3  Neonatal    30       1  0.035015
              2       SMAD3  Neonatal    30       2  0.042596
              3       SMAD3  Neonatal    30       3  0.038034
              4       SMAD3  Neonatal    30       4  0.040363
              5       SMAD3  Neonatal    30       5  0.034818
              6       SMAD3  Neonatal    30       6  0.031685

Note that I kept the columns that form the index as regular columns as well, because it makes plotting with seaborn a bit easier. My question is: how would I perform a t-test of the hypothesis that, at each time point, the means of the different cell lines differ significantly from each other?

For example, in the data above, I want to perform a t-test on df.loc[('Adult', 0)] and df.loc[('Neonatal', 0)], i.e. the same time point but different cell lines.
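
One possible way to set up this comparison is sketched below, assuming SciPy is available and that scipy.stats.ttest_ind (an independent two-sample t-test) suits the design. The helper name ttest_by_time, the equal_var=False choice (Welch's test), and the demo values are all assumptions made for illustration; they are not taken from the data above.

```python
import pandas as pd
from scipy import stats


def ttest_by_time(df, group_a, group_b, value_col="delta_ct", equal_var=False):
    """Hypothetical helper: t-test group_a vs group_b per gene and time point."""
    # The index levels and the columns share names (Group, Time, Repeat),
    # which makes label-based grouping ambiguous, so drop the index and
    # work from the regular columns only.
    flat = df.reset_index(drop=True)
    rows = []
    for (gene, time), sub in flat.groupby(["Gene", "Time"]):
        a = sub.loc[sub["Group"] == group_a, value_col]
        b = sub.loc[sub["Group"] == group_b, value_col]
        # Skip time points that are not measured in both cell lines.
        if len(a) < 2 or len(b) < 2:
            continue
        t_stat, p_value = stats.ttest_ind(a, b, equal_var=equal_var)
        rows.append({"Gene": gene, "Time": time, "t": t_stat, "p": p_value})
    return pd.DataFrame(rows, columns=["Gene", "Time", "t", "p"])


# Made-up values purely for illustration (not the measurements shown above).
demo = pd.DataFrame({
    "Gene": ["SMAD3"] * 12,
    "Group": ["Adult"] * 6 + ["Neonatal"] * 6,
    "Time": [0] * 12,
    "Repeat": list(range(1, 7)) * 2,
    "delta_ct": [0.10, 0.09, 0.11, 0.08, 0.10, 0.09,
                 0.04, 0.05, 0.04, 0.03, 0.05, 0.04],
})
# Mirror the layout in the question: index built from columns that are kept.
demo = demo.set_index(["Group", "Time", "Repeat"], drop=False)

result = ttest_by_time(demo, "Adult", "Neonatal")
print(result)
```

With 126 genes and several time points this produces many p-values, so some form of multiple-testing correction is probably worth considering before interpreting them.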
