user13419531

How to use hypothesis testing to compare groups

I have a data frame with name, type, and Turnover per game. A sample of that df is given below.

Name    Type    Turnover per game
kevin   A       5
duke    B       10
jonas   A       12
angly   A       2
anjelo  B       10
wily    A       4
nick    A       8

What I want to do is implement a hypothesis test to check whether Type A players average fewer turnovers than Type B players.

What I tried:

Firstly, group by Type:

df.groupby('Type')['Turnover per game'].mean()

But I don't know how to implement a hypothesis test to check the above condition.

Upvotes: 2

Views: 1014

Answers (2)

Stef

Reputation: 30579

Hypothesis testing can be done with scipy.stats.ttest_ind:

import pandas as pd
from scipy import stats

data = {'Name': ['kevin', 'duke', 'jonas', 'angly', 'anjelo', 'wily', 'nick'],
        'Type': ['A', 'B', 'A', 'A', 'B', 'A', 'A'],
        'Turnover': [5, 10, 12, 2, 10, 4, 8]}
df = pd.DataFrame(data)

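# Welch's t-test (equal_var=False), one-sided: the alternative is mean(A) < mean(B)
t, p = stats.ttest_ind(df.Turnover[df.Type.eq('A')], df.Turnover[df.Type.eq('B')],
                       equal_var=False, alternative='less')

# Compare the p-value against a 5% significance level
if p < 0.05:
    print('Type A players average fewer turnovers than Type B players')
else:
    print('Null hypothesis (equal means) cannot be rejected.')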

In your example, the null hypothesis that type A and type B players have equal mean turnovers is rejected at the 5% level, in favour of the alternative hypothesis that type A players average fewer turnovers than type B players. See the Interpretation section of the Wikipedia article on hypothesis testing for details.
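To make it clearer what ttest_ind computes here, the following is a minimal sketch that recomputes the Welch statistic, the degrees of freedom, and the one-sided p-value by hand and compares them with SciPy's result. The helper names (mean, sample_var, t_manual, dof, p_manual) are illustrative and not part of the original answer; the data are the seven sample rows from the question.

```python
import math
from scipy import stats

# Sample values copied from the question, split by Type
a = [5, 12, 2, 4, 8]   # Type A
b = [10, 10]           # Type B

def mean(xs):
    return sum(xs) / len(xs)

def sample_var(xs):
    # Unbiased sample variance (divides by n - 1)
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Estimated variance of each sample mean
va = sample_var(a) / len(a)
vb = sample_var(b) / len(b)

# Welch t statistic: difference in means over its standard error
t_manual = (mean(a) - mean(b)) / math.sqrt(va + vb)

# Welch-Satterthwaite approximation of the degrees of freedom
dof = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))

# One-sided p-value for the alternative mean(A) < mean(B)
p_manual = stats.t.cdf(t_manual, dof)

# Same test via SciPy (alternative= requires SciPy 1.6 or newer)
t_scipy, p_scipy = stats.ttest_ind(a, b, equal_var=False, alternative='less')

print(t_manual, dof, p_manual)
print(t_scipy, p_scipy)
```

Both lines should print the same statistic and p-value. Note that the sample contains only two Type B values, both equal to 10, so the result rests on very little data; with the full data frame the outcome may well differ.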

Upvotes: 1

ggaurav

Reputation: 1804

The hypothesis test you have mentioned, if I understand correctly, looks straightforward.

Get the mean turnover for each group by grouping on 'Type':

df_group_by_type = df.groupby('Type')['Turnover per game'].mean()
df_group_by_type

Type
A    6.2 
B    10.0

and then just check the required condition:

df_group_by_type['A'] < df_group_by_type['B']
True
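A self-contained version of the comparison above, as a sketch that assumes the seven sample rows from the question (the variable names means and a_is_lower are illustrative). Note that this compares the two sample means directly and does not produce a p-value.

```python
import pandas as pd

# Sample rows copied from the question
df = pd.DataFrame({
    'Name': ['kevin', 'duke', 'jonas', 'angly', 'anjelo', 'wily', 'nick'],
    'Type': ['A', 'B', 'A', 'A', 'B', 'A', 'A'],
    'Turnover per game': [5, 10, 12, 2, 10, 4, 8],
})

# Mean turnover per Type
means = df.groupby('Type')['Turnover per game'].mean()

# Direct comparison of the two sample means
a_is_lower = bool(means['A'] < means['B'])

print(means)
print(a_is_lower)
```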

Upvotes: 0
