Reputation: 1857
I have a DataFrame named df
, and I want to get the average usage time
of different apps
by different gender
groups.
import pandas as pd
df=pd.DataFrame({'user':[2,3,4,4,5,5],'gender':[0,0,1,1,1,1],
'app':['k','k','k','k','s','s'],'time':[6,10,10,6,3,1]})
Input:
app gender time user
0 k 0 6 2
1 k 0 10 3
2 k 1 10 4
3 k 1 6 4
4 s 1 3 5
5 s 1 1 5
For app k
, the total time for the gender 0
group to use app k
is 16 (10 + 6 )
, so the average usage time 0_k
is 8.0
.
And the total time for the gender 1
group to use app k
is 16 (10 + 6 + 0 + 0)
, so the average usage time 1_k
is 4.0
.
Expected:
dict = {'0_k': 8.0, '0_s': 0, '1_k': 4.0, '1_s': 1.0}
Upvotes: 0
Views: 53
Reputation: 6168
(df.groupby(["app", "gender"]).sum()/df.groupby(["gender"]).count()).time
app gender
k 0 8.0
1 4.0
s 1 1.0
To trasform it to a dictionary:
dict = (df.groupby(["app", "gender"]).sum()/df.groupby(["gender"]).count()).time.to_dict()
{('k', 0): 8.0, ('s', 1): 2.0, ('k', 1): 8.0}
Upvotes: 0
Reputation: 13255
IIUC I think you need:
df['new_col'] = df.gender.astype(str)+'_'+df.app
df['Average'] = df.groupby(['gender','app'])['time'].transform('sum')/\
df.groupby(['gender'])['time'].transform('count')
print(df)
user gender app time new_col Average
0 2 0 k 6 0_k 8.0
1 3 0 k 10 0_k 8.0
2 4 1 k 10 1_k 4.0
3 4 1 k 6 1_k 4.0
4 5 1 s 3 1_s 1.0
5 5 1 s 1 1_s 1.0
d = dict(df[['new_col','Average']].values)
print(d)
{'0_k': 8.0, '1_k': 4.0, '1_s': 1.0}
Upvotes: 2