I have a data frame with name, type, and Turnover per game. A sample of that df is given below.
Name    Type  Turnover per game
kevin   A     5
duke    B     10
jonas   A     12
angly   A     2
anjelo  B     10
wily    A     4
nick    A     8
What I want to do is implement a hypothesis test to check whether Type A players average fewer turnovers than Type B players.
What I tried:
First, I grouped by Type:
df.groupby('Type')['Turnover per game'].mean()
But I don't know how to implement a hypothesis test to check the above condition.
Upvotes: 2
Views: 1014
Reputation: 30579
Hypothesis testing can be done with ttest_ind:
import pandas as pd
from scipy import stats
data = {'Name': ['kevin', 'duke', 'jonas', 'angly', 'anjelo', 'wily', 'nick'],
        'Type': ['A', 'B', 'A', 'A', 'B', 'A', 'A'],
        'Turnover': [5, 10, 12, 2, 10, 4, 8]}
df = pd.DataFrame(data)
# One-sided Welch t-test (unequal variances): is the mean of A less than the mean of B?
# The `alternative` argument requires SciPy 1.6 or later.
t, p = stats.ttest_ind(df.Turnover[df.Type.eq('A')], df.Turnover[df.Type.eq('B')],
                       equal_var=False, alternative='less')
if p < 0.05:
    print('Type A players average fewer turnovers than Type B players')
else:
    print('Null hypothesis (equal means) cannot be rejected.')
In your example, the null hypothesis that type A and B players have equal mean turnovers will be rejected at the 5% significance level, and the alternative hypothesis that type A players average fewer turnovers than type B players will be accepted. See the section Interpretation in the above linked Wikipedia article for details.
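To see the numbers behind that conclusion, here is a minimal sketch that prints the test statistic and p-value. It assumes the column is named 'Turnover per game' as in the question (so bracket indexing is needed instead of attribute access) and SciPy 1.6 or later for the alternative argument. The variable names and the manual Welch statistic are illustrative additions, included only as a cross-check.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Sample data from the question, keeping its original column name.
df = pd.DataFrame({
    'Name': ['kevin', 'duke', 'jonas', 'angly', 'anjelo', 'wily', 'nick'],
    'Type': ['A', 'B', 'A', 'A', 'B', 'A', 'A'],
    'Turnover per game': [5, 10, 12, 2, 10, 4, 8],
})

# A column name with spaces cannot be used as an attribute, so index with brackets.
a = df.loc[df['Type'] == 'A', 'Turnover per game']
b = df.loc[df['Type'] == 'B', 'Turnover per game']

# One-sided Welch t-test: the alternative hypothesis is mean(A) < mean(B).
t_stat, p_value = stats.ttest_ind(a, b, equal_var=False, alternative='less')

# Cross-check: Welch's statistic is the mean difference divided by the
# combined standard error of the two sample means.
se = np.sqrt(a.var(ddof=1) / len(a) + b.var(ddof=1) / len(b))
t_manual = (a.mean() - b.mean()) / se

print(f't = {t_stat:.3f}, p = {p_value:.3f}, manual t = {t_manual:.3f}')
```

With only five type A players and two type B players, the result is sensitive to individual values, so it is worth rerunning on the full data frame before drawing conclusions.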
Upvotes: 1
Reputation: 1804
The hypothesis test you have mentioned, if I understand correctly, looks straightforward.
Get the mean turnover for each group by grouping by 'Type':
df_group_by_type = df.groupby('Type')['Turnover per game'].mean()
df_group_by_type
Type
A 6.2
B 10.0
Then check the required condition:
df_group_by_type['A'] < df_group_by_type['B']
True
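The comparison above describes the sample means only. As a rough sketch of what could be looked at alongside them, the snippet below reports each group's size and sample standard deviation with groupby(...).agg. The name summary is a hypothetical choice, and the data is the sample from the question.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    'Name': ['kevin', 'duke', 'jonas', 'angly', 'anjelo', 'wily', 'nick'],
    'Type': ['A', 'B', 'A', 'A', 'B', 'A', 'A'],
    'Turnover per game': [5, 10, 12, 2, 10, 4, 8],
})

# Per-group count, mean, and sample standard deviation in one table.
summary = df.groupby('Type')['Turnover per game'].agg(['count', 'mean', 'std'])
print(summary)

# The same check as above, read from the summary table.
print(summary.loc['A', 'mean'] < summary.loc['B', 'mean'])
```

The count and std columns show how many players and how much spread sit behind each mean, which the bare comparison does not reveal.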
Upvotes: 0