Reputation: 475
I would like to do 2 manupilation with pandas but I cannot manage to do it.
I have a dataframe that looks like this :
df = pd.DataFrame({'Name': {0: 'Eric', 1: 'Mattieu',2: 'Eric',3: 'Mattieu', 4: 'Mattieu',5: 'Eric',6:'Mattieu',7:'Franck',8:'Franck',9:'Jack',10:'Jack'},
'Value': {0: False, 1:False,2:True,3:False, 4:True,5: True,6: False,7:True,8:True,9:False,10:False},
})
df=df.sort_values(["Name"])
df
output:
Name Value
0 Eric False
2 Eric True
5 Eric True
7 Franck True
8 Franck True
9 Jack False
10 Jack False
1 Mattieu False
3 Mattieu False
4 Mattieu True
6 Mattieu False
Manupilation 1 : I would like to have the Number of True , False, Total value and the mean of True Value for each Name, like this :
Name Nbr True Nbr False Total Value Mean (True/(False+True))
0 Eric 2 1 3 0.75
1 Franck 2 0 2 1.00
2 Jack 0 2 2 0.00
3 Mattieu 1 3 4 0.25
Manupilation 2: I would like to get a group by mean of the column "Mean (True/(False+True))" grouped by "Total value", like this :
Group by Total Value Mean of grouped Total Value
0 2 0.50
1 3 0.75
2 4 0.25
Thanks in advance for your help
Upvotes: 1
Views: 417
Reputation: 323316
First one can be done with crosstab
s1 = pd.crosstab(df['Name'], df['Value'], margins=True).drop('All').assign(Mean = lambda x : x[True]/x['All'])
Out[266]:
Value False True All Mean
Name
Eric 1 2 3 0.666667
Franck 0 2 2 1.000000
Jack 2 0 2 0.000000
Mattieu 3 1 4 0.250000
2nd dataframe do with groupby
s2 = s1.groupby('All').apply(lambda x : sum(x[True])/sum(x['All'])).reset_index(name='Mean of ALL')
Out[274]:
All Mean of ALL
0 2 0.500000
1 3 0.666667
2 4 0.250000
Upvotes: 3