Reputation: 1954
Given some data about dishes, the location of the restaurant and its sales:
>>> import pandas
>>> df1 = pandas.DataFrame({"dish" : ["fish", "chicken", "fish", "chicken", "chicken"],
... "location" : ["central", "central", "north", "north", "south"],
... "sales" : [1,3,5,2,4]})
>>> df1
      dish location  sales
0     fish  central      1
1  chicken  central      3
2     fish    north      5
3  chicken    north      2
4  chicken    south      4
>>> df2 = df1[["dish", "location"]].copy()
>>> df2["sales_contrib"] = 0.0
>>> df2
      dish location  sales_contrib
0     fish  central            0.0
1  chicken  central            0.0
2     fish    north            0.0
3  chicken    north            0.0
4  chicken    south            0.0
Right now, I would like to fill df2["sales_contrib"] with each row's sales as a percentage of the total sales for that dish.
The resultant df2 is
      dish location  sales_contrib
0     fish  central          16.67
1  chicken  central          33.33
2     fish    north          83.33
3  chicken    north          22.22
4  chicken    south          44.45
I tried using iteritems() but could not get the expected results.
Upvotes: 0
Views: 60
Reputation: 38415
Try
(df1.groupby(['dish', 'location']).sales.sum().div(df1.groupby('dish').sales.sum()) * 100).round(2).reset_index()
      dish location  sales
0  chicken  central  33.33
1  chicken    north  22.22
2  chicken    south  44.44
3     fish  central  16.67
4     fish    north  83.33
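The one-liner above relies on pandas aligning the two groupby results on their shared "dish" index level. Here is a minimal sketch that spells that step out, assuming the same df1 as in the question. The names pair_sales, dish_sales and result are only illustrative, and the column is renamed to sales_contrib to match the question.

```python
import pandas as pd

# Same data as in the question.
df1 = pd.DataFrame({
    "dish": ["fish", "chicken", "fish", "chicken", "chicken"],
    "location": ["central", "central", "north", "north", "south"],
    "sales": [1, 3, 5, 2, 4],
})

# Sales per (dish, location) pair; the result has a two-level MultiIndex.
pair_sales = df1.groupby(["dish", "location"])["sales"].sum()

# Total sales per dish; the result is indexed by "dish" only.
dish_sales = df1.groupby("dish")["sales"].sum()

# level="dish" broadcasts each dish total across the matching index level.
contrib = pair_sales.div(dish_sales, level="dish").mul(100).round(2)

# Illustrative rename so the column matches the question's naming.
result = contrib.rename("sales_contrib").reset_index()
print(result)
```

Note that the rows come back sorted by dish and location (groupby sorts its keys by default), not in the original row order of df1.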
Upvotes: 3
Reputation: 2203
You can do this with pandas by computing the total sales per dish with groupby and then dividing each row's sales by its dish total:
dish_totals = df1.groupby(by="dish")["sales"].sum()
df2["sales_contrib"] = df1.apply((lambda row: 100*row["sales"]/dish_totals.loc[row["dish"]]), axis=1)
print(df2)
Output:
      dish location  sales_contrib
0     fish  central      16.666667
1  chicken  central      33.333333
2     fish    north      83.333333
3  chicken    north      22.222222
4  chicken    south      44.444444
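A vectorized variant of the same idea, offered as a sketch rather than the definitive approach: groupby(...).transform("sum") returns the per-dish total aligned to each original row, so no row-wise apply is needed. It assumes the same df1 as in the question. The .copy() call makes df2 an independent frame, and rounding to two decimals is an optional choice to match the question's expected output.

```python
import pandas as pd

# Same data as in the question.
df1 = pd.DataFrame({
    "dish": ["fish", "chicken", "fish", "chicken", "chicken"],
    "location": ["central", "central", "north", "north", "south"],
    "sales": [1, 3, 5, 2, 4],
})

# Copy the two columns so that adding a column does not touch df1.
df2 = df1[["dish", "location"]].copy()

# Per-dish total repeated on every row, in the original row order.
dish_totals = df1.groupby("dish")["sales"].transform("sum")

# Each row's share of its dish total, as a percentage.
df2["sales_contrib"] = (100 * df1["sales"] / dish_totals).round(2)
print(df2)
```

Because transform keeps the original index, the result stays in the same row order as df1, which is what the question asks for.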
Upvotes: 4