Reputation: 3
Hi suppose I have the df_test as below:
df_test = pd.DataFrame(dict(A=['bond1','bond1', 'bond2', 'bond2', 'bond3'],
B=[-1, 1, 2, 3, 4]))
I want to create a column C where if B < 0, it's the mean value of B groupby A, if B > 0, it's B. The method I can think of is:
group_mean = df_test.groupby('A')['B'].mean().reset_index()
df_merge = df_test.merge(group_mean, on=['A'], how='left')
df_merge['C'] = df_merge.apply(lambda x: x['B_y'] if x['B_x'] <0 else x['B_x'], axis=1)
Wondering if it's possible to do it in one line, something like:
df_test['C'] = df_test.groupby('B').transform(...)
Thanks for your help
Upvotes: 0
Views: 168
Reputation: 153460
IIUC:
df_test['C'] = df_test.groupby('A')['B'].transform('mean').mask(df['B']>0, df_test['B'])
Output:
A B C
0 bond1 -1 0.0
1 bond1 1 1.0
2 bond2 2 2.5
3 bond2 3 2.5
4 bond3 4 4.0
Upvotes: 2