Reputation: 39
Here is my dataframe:
   ID     Tell  Number
0   2  Perhaps       3
1   1      Yes       6
2   1       No       9
3   2      Yes       4
4   2      Yes       7
5   1       No       8
6   1      Yes      15
7   2  Perhaps       8
8   1       No       6
9   2      Yes      13
import pandas as pd
from pandas import DataFrame
from scipy import stats
# Creating the dictionary
dic = {'ID': [2,1,1,2,2,1,1,2,1,2], 'Tell': ['Perhaps', 'Yes', 'No', 'Yes','Yes', 'No','Yes', 'Perhaps','No', 'Yes'], 'Number': [3,6,9,4,7,8,15,8,6,13]}
# Creating the dataframe
df = pd.DataFrame(dic)
I want to be able to select columns from my dataframe and conduct an independent t-test. I want the ID column to be the grouping variable and the Number column to be the dependent variable. When I do for example:
ex=stats.ttest_ind(df['ID'],df['Number'])
print(ex)
It prints p-value=4.116, which doesn't really make sense. When I use statistical software such as jamovi, it gives me a p-value of 0.478.
Please help.
Upvotes: 0
Views: 784
Reputation: 1560
For me it prints
Ttest_indResult(statistic=-5.379185420933047, pvalue=4.1168498868556256e-05)
Notice the e-05 at the end, which you may have overlooked: the p-value is in scientific notation, so it is about 0.0000412, not 4.116.
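If the notation is unfamiliar, a quick way to see the value in plain decimal form is to format it yourself. This is just a small sketch; the variable names `p` and `fixed` are mine, and the number is the one printed above.

```python
# p-value as printed by ttest_ind above (scientific notation)
p = 4.1168498868556256e-05

# Format with 7 decimal places to see it as an ordinary decimal
fixed = f"{p:.7f}"
print(fixed)  # 0.0000412
```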
I checked it another way and got the same result.
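Separately, note that stats.ttest_ind(df['ID'], df['Number']) compares the ID values against the Number values; it does not use ID as a grouping variable. Here is a minimal sketch of what I think you want, assuming the two groups are ID 1 and ID 2 and a Student's t-test with equal variances (the scipy default; I am assuming that is what jamovi ran). The names `group1` and `group2` are just mine.

```python
import pandas as pd
from scipy import stats

# Same ID and Number data as in the question
df = pd.DataFrame({
    'ID': [2, 1, 1, 2, 2, 1, 1, 2, 1, 2],
    'Number': [3, 6, 9, 4, 7, 8, 15, 8, 6, 13],
})

# Split the dependent variable (Number) by the grouping variable (ID)
group1 = df.loc[df['ID'] == 1, 'Number']
group2 = df.loc[df['ID'] == 2, 'Number']

# Independent two-sample t-test, equal variances assumed (scipy default)
result = stats.ttest_ind(group1, group2)
print(result.statistic, result.pvalue)
```

On the data from the question this gives a p-value of about 0.478, which is the figure you got from jamovi.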
Upvotes: 1