Reputation: 97
I have a DataFrame with 3 columns: _a, _b, _c.
import numpy as np
import pandas as pd
df = pd.DataFrame({'_a':[1,1,1,2,2,3,3],'_b':[3,3,5,3,7,3,9], '_c':[10,11,12,13,14,15,16], 'a_b_3':[21,21,21,13,13,15,15]})
df
   _a  _b  _c  a_b_3
0   1   3  10     21
1   1   3  11     21
2   1   5  12     21
3   2   3  13     13
4   2   7  14     13
5   3   3  15     15
6   3   9  16     15
I need to create the column a_b_3 (for each value of _a, the sum of all _c values where _b = 3) using groupby from pandas. The a_b_3 column above shows the expected result. Thank you in advance.
Upvotes: 0
Views: 57
Reputation: 153460
Use:
df['a_b_3'] = df['_a'].map(df[df['_b'] == 3].groupby('_a')['_c'].sum())
Output:
   _a  _b  _c  a_b_3
0   1   3  10     21
1   1   3  11     21
2   1   5  12     21
3   2   3  13     13
4   2   7  14     13
5   3   3  15     15
6   3   9  16     15
Explanation
First, filter down to only the rows where _b equals 3, then group by _a and sum _c to create a Series indexed by _a. Then use that Series to map each _a value in the original DataFrame to its sum.
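The one-liner can be unpacked into its three steps, which may make the intermediate Series easier to inspect. This is only a sketch: the names filtered, sums, and alt are illustrative and not part of the original answer, and the final transform-based variant is an alternative I am assuming is acceptable, not the method given above.

```python
import pandas as pd

df = pd.DataFrame({
    '_a': [1, 1, 1, 2, 2, 3, 3],
    '_b': [3, 3, 5, 3, 7, 3, 9],
    '_c': [10, 11, 12, 13, 14, 15, 16],
})

# Step 1: keep only the rows where _b equals 3.
filtered = df[df['_b'] == 3]

# Step 2: sum _c within each _a group; the result is a Series indexed by _a.
sums = filtered.groupby('_a')['_c'].sum()

# Step 3: look up each row's _a value in that Series.
df['a_b_3'] = df['_a'].map(sums)

# Alternative sketch (not from the answer): replace _c with 0 wherever
# _b != 3, then broadcast the per-_a sum back to every row with transform.
alt = df['_c'].where(df['_b'] == 3, 0).groupby(df['_a']).transform('sum')

print(df)
```

One difference worth knowing: with map, an _a value that has no rows where _b equals 3 ends up as NaN, whereas the transform variant gives 0 for such a group. Which is preferable depends on how a missing match should be treated.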
Upvotes: 1