Reputation: 3379
Say I have the following dataframe:
>>> df=pd.DataFrame([[150,90,60],[200,190,10],[400,150,250]], columns=['Total','Group1','Group2'])
>>> df
   Total  Group1  Group2
0    150      90      60
1    200     190      10
2    400     150     250
>>>
As you can see, Group1 and Group2 sum to Total (think age categories in census data). I want to express each group as a percentage of the row's Total.
Right now I'm doing this as follows:
>>> df2=df.copy()
>>> for Group in ['Group1','Group2']:
...     df2[Group] = df[Group] / df['Total'] * 100
...
>>>
>>> df2
   Total  Group1  Group2
0    150    60.0    40.0
1    200    95.0     5.0
2    400    37.5    62.5
>>>
However, I'm sure there is a way to do this without the for loop, perhaps using applymap or map. Can someone show me a more efficient way to do this calculation?
Upvotes: 1
Views: 1851
Reputation: 2137
>>> print(df.drop('Total', axis=1).divide(df.Total, axis=0))
   Group1  Group2
0   0.600   0.400
1   0.950   0.050
2   0.375   0.625
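Note that this gives fractions rather than the percentages asked for in the question. A minimal sketch of the same idea scaled by 100, assuming the question's df (the name `pct` is just for illustration):

```python
import pandas as pd

df = pd.DataFrame([[150, 90, 60], [200, 190, 10], [400, 150, 250]],
                  columns=['Total', 'Group1', 'Group2'])

# Drop Total, divide every remaining column by the row's Total
# (axis=0 aligns the divisor on the row index), then scale to percent.
pct = df.drop(columns='Total').div(df['Total'], axis=0) * 100
print(pct)
```

The result has only the Group1 and Group2 columns; Total is left out on purpose.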
Upvotes: 2
Reputation: 109576
You can just divide as follows:
>>> df.div(df.Total.values, axis=0)
   Total  Group1  Group2
0      1   0.600   0.400
1      1   0.950   0.050
2      1   0.375   0.625
I wouldn't recommend mixing values and percentages, but if you really want to, you can reassign Total:
df2 = df.div(df.Total.values, axis=0)
df2['Total'] = df.Total
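If the goal is to reproduce the question's df2 exactly (Total kept, groups as percentages), one alternative is to divide only the group columns, so Total never needs to be reassigned. This is a sketch, not the only way to do it; `group_cols` is a name introduced here for illustration:

```python
import pandas as pd

df = pd.DataFrame([[150, 90, 60], [200, 190, 10], [400, 150, 250]],
                  columns=['Total', 'Group1', 'Group2'])

group_cols = ['Group1', 'Group2']  # illustrative name, not from the original post

df2 = df.copy()
# Divide just the group columns by Total, row by row, and scale to percent.
# Total itself is left untouched.
df2[group_cols] = df[group_cols].div(df['Total'], axis=0) * 100
print(df2)
```

This replaces the for loop with a single vectorized division over the selected columns.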
Upvotes: 2