yegu sanon

Reputation: 39

t-test on dataframe columns

Here is my dataframe:

   Id  Tell     Number
0   2  Perhaps       2
1   1  Yes           6
2   1  No            9
3   2  Yes           4
4   2  Yes           7
5   1  No            8
6   1  Yes          15
7   2  Perhaps       2
8   1  No            6
9   2  Yes           2

import pandas as pd
from scipy import stats

# Creating the dictionary
dic = {
    'ID': [2, 1, 1, 2, 2, 1, 1, 2, 1, 2],
    'Tell': ['Perhaps', 'Yes', 'No', 'Yes', 'Yes', 'No', 'Yes', 'Perhaps', 'No', 'Yes'],
    'Number': [3, 6, 9, 4, 7, 8, 15, 8, 6, 13],
}

# Creating the dataframe
df = pd.DataFrame(dic)

I want to be able to select columns from my dataframe and conduct an independent t-test. I want the ID column to be the grouping variable and the Number column to be the dependent variable. When I do for example:

ex=stats.ttest_ind(df['ID'],df['Number']) 
print(ex)

It prints p-value=4.116, which doesn't really make sense. When I use statistical software like jamovi, it gives me a p-value of 0.478.

Please help.

Upvotes: 0

Views: 784

Answers (1)

Hugolmn

Reputation: 1560

For me it prints

Ttest_indResult(statistic=-5.379185420933047, pvalue=4.1168498868556256e-05)

Notice the e-05 at the end: the p-value is printed in scientific notation, so it is about 0.0000412, not 4.116. This is what you may have overlooked.
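
To make the magnitude easier to read, the value can be printed in fixed-point notation. A minimal sketch, with the p-value copied from the output above (the variable names are only for illustration):

```python
# p-value copied from the Ttest_indResult printed above
pvalue = 4.1168498868556256e-05

# Fixed-point formatting makes the leading zeros visible
formatted = f"{pvalue:.6f}"
print(formatted)
```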

I also tested it another way and got the same result.
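
As a side note, stats.ttest_ind(df['ID'], df['Number']) treats the ID column and the Number column as the two samples being compared. If ID is meant to be the grouping variable, the Number values probably need to be split by ID first. A minimal sketch, assuming the two groups are ID 1 and ID 2 as in the question's dictionary (group1, group2 and result are names I made up):

```python
import pandas as pd
from scipy import stats

# Same data as the dictionary in the question
df = pd.DataFrame({
    'ID': [2, 1, 1, 2, 2, 1, 1, 2, 1, 2],
    'Tell': ['Perhaps', 'Yes', 'No', 'Yes', 'Yes', 'No', 'Yes', 'Perhaps', 'No', 'Yes'],
    'Number': [3, 6, 9, 4, 7, 8, 15, 8, 6, 13],
})

# Split the dependent variable (Number) by the grouping variable (ID)
group1 = df.loc[df['ID'] == 1, 'Number']
group2 = df.loc[df['ID'] == 2, 'Number']

# Independent two-sample t-test (equal variances assumed by default)
result = stats.ttest_ind(group1, group2)
print(result)
```

With the data from the question, this should give a p-value close to the 0.478 reported by jamovi.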

Upvotes: 1
