Domec
Domec

Reputation: 87

How to divide each values of the group with total size in python?

I have a dataframe with Yes/no answer in the column 'quality','price','time'.

I transformed in 1 and 0 and grouped

grouped = df.group.by(['country'])[['quality','price','time']].sum() to get only the 'Yes' answers and the result is:

country quality price time
FRANCE 5 4 3
GERMANY 3 2 6
UK 2 1 4

I would like to know how to divide each values in groupby for the size(N total) of each country respondents, in my case FRANCE = 9 , GERMANY = 11, UK = 12.

I know that i can select single group and make operations with it: france = country_split.loc[['FRANCE']]

(france/9)*100

but it is possible to make operations for the entire group in one time?

Upvotes: 1

Views: 681

Answers (1)

jezrael
jezrael

Reputation: 862901

Use Series.value_counts for counts and divide values of columns after aggregate sum, then multiple by 100:

#if need dict for counts
#s = {'FRANCE': 9, 'GERMANY': 11, 'UK': 12}

s = df['country'].value_counts()

grouped = df.groupby(['country'])[['quality','price','time']].sum().div(s, axis=0).mul(100)
print (grouped)
           quality      price       time
country                                 
FRANCE   55.555556  44.444444  33.333333
GERMANY  27.272727  18.181818  54.545455
UK       16.666667   8.333333  33.333333

Not tested, but possible solution should be aggregate mean instead sum:

grouped = df.groupby(['country'])[['quality','price','time']].mean()

Upvotes: 1

Related Questions