Reputation: 3636
I have the following dataframe:
A | B | C
---------
1 | 22 | 12
2 | 22 | 5
2 | 22 | 5
3 | 23 | 6
I want to add a new column called D to this dataframe. The value of D should be the most frequent value of C (the mode) within each group of A and B.
I tried this:
from scipy.stats import mstats

def mode(x):
    return mstats.mode(x, axis=None)[0]

df_total['D'] = df_total.groupby(['A','B']).agg({'C': mode})
but I get the following error:
TypeError: incompatible index of inserted column with frame index
Any ideas how to fix this?
Thanks everyone!
Upvotes: 2
Views: 990
Reputation: 164623
You can use groupby with pd.Series.mode. The difficulty is that pd.Series.mode returns a series rather than a scalar, so it is not considered a "reducing" function. Therefore, you must extract the first value of the series.
Data from @gyoza.
df['D'] = df.groupby(['A', 'B'])['C'].transform(lambda x: x.mode().iloc[0])
print(df)
A B C D
0 1 22 12 12
1 2 22 5 5
2 2 22 5 5
3 2 22 3 5
4 3 23 6 6
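For context, here is a minimal sketch of why the original agg attempt fails while transform works. The sample frame is rebuilt from the printed output above, and the helper name first_mode and the merge-based alternative are my own illustrations, not part of the original post. When several values tie, Series.mode returns them in ascending order, so taking .iloc[0] picks the smallest one.

```python
import pandas as pd

# Sample data rebuilt from the output shown above (columns A, B, C).
df = pd.DataFrame({
    "A": [1, 2, 2, 2, 3],
    "B": [22, 22, 22, 22, 23],
    "C": [12, 5, 5, 3, 6],
})

def first_mode(s):
    # Hypothetical helper: Series.mode() returns every tied mode,
    # sorted ascending, so take the first to get a scalar.
    return s.mode().iloc[0]

# agg reduces to one value per (A, B) group, indexed by the group keys.
# That index does not match df's row index, which is why assigning the
# result directly as a column raises an error.
per_group = df.groupby(["A", "B"])["C"].agg(first_mode)

# transform broadcasts each group's value back onto the original rows,
# so the result lines up with df and can be assigned as a column.
df["D"] = df.groupby(["A", "B"])["C"].transform(first_mode)

# Alternative: keep the aggregated result and merge it back on the keys.
merged = df.drop(columns="D").merge(
    per_group.rename("D").reset_index(), on=["A", "B"]
)
```

Either route should give the same D column; transform is shorter, while the merge form can be handy if the per-group table is also needed on its own.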
Upvotes: 1