Reputation: 4482
I have a dataframe that looks like this:
import pandas as pd
foo = pd.DataFrame({'id':[1,1,2,2], 'val':[1,1,1,0]})
I would like to create a new column that holds the percentage of rows where val == 1, computed per id.
The resulting dataframe should look like this:
foo = pd.DataFrame({'id':[1,1,2,2], 'val':[1,1,1,0], 'percentage':[1,1,0.5,0.5]})
Any ideas how I can do that?
Upvotes: 1
Views: 33
Reputation: 862661
If val contains only 0 and 1, you can use mean with GroupBy.transform to create the new column:
foo['percentage'] = foo.groupby('id')['val'].transform('mean')
print(foo)
   id  val  percentage
0   1    1         1.0
1   1    1         1.0
2   2    1         0.5
3   2    0         0.5
If val can contain values other than 0 and 1, first compare it with 1 using Series.eq, then take the mean of the resulting boolean mask per id:
foo['percentage'] = foo['val'].eq(1).groupby(foo['id']).transform('mean')
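To show why the comparison step matters, here is a minimal sketch on made-up data where val also contains a 2 (this data is an assumption for illustration, not from the question). A plain mean would average the raw values, while the eq(1) mask gives the share of rows equal to 1 in each id:

```python
import pandas as pd

# Hypothetical data: val is not restricted to 0 and 1
foo = pd.DataFrame({'id': [1, 1, 2, 2], 'val': [1, 2, 1, 1]})

# Plain mean averages the raw values, which is not a share of val == 1
naive = foo.groupby('id')['val'].transform('mean')

# Boolean mask first; the mean of True/False per group is the share of True
foo['percentage'] = foo['val'].eq(1).groupby(foo['id']).transform('mean')

print(naive.tolist())
print(foo)
```

Here id 1 has one matching row out of two and id 2 has two out of two, so the mask-based column holds the per-id share, whereas the plain mean for id 1 is 1.5.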
Upvotes: 1