Aceconhielo

Reputation: 3636

Python Pandas - New column grouping and mode

I have the following dataframe:

A | B | C
---------
1 | 22 | 12
2 | 22 | 5
2 | 22 | 5
3 | 23 | 6

I want to add a new column called D to this dataframe. The value of D should be the most frequent value of C (the mode) within each group of A and B.

I tried this:

from scipy.stats import mstats

def mode(x):
    return mstats.mode(x, axis=None)[0]

df_total['D'] = df_total.groupby(['A','B']).agg({'C': mode})

but I get the following error:

TypeError: incompatible index of inserted column with frame index

Any ideas on how to fix this?

Thanks everyone!

Upvotes: 2

Views: 990

Answers (1)

jpp

Reputation: 164623

You can use groupby with pd.Series.mode, applied through transform so that the result is aligned with the original index. The difficulty is that pd.Series.mode returns a Series rather than a scalar (there may be ties), so it is not treated as a "reducing" function. You must therefore extract the first value of the Series.
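To illustrate why the first value has to be extracted, here is a small sketch on a made-up Series (the values are an assumption, not taken from the question). With a tie, mode returns both values in sorted order, and .iloc[0] picks the smallest one:

```python
import pandas as pd

# Hypothetical values with a tie between 3 and 5
s = pd.Series([5, 5, 3, 3, 7])

modes = s.mode()       # a Series holding every tied mode, sorted ascending
first = modes.iloc[0]  # a scalar: the smallest of the tied modes

print(modes.tolist())
print(first)
```

Taking .iloc[0] is one possible tie-breaking rule; if ties matter for your data, you may want a different one.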

Data from @gyoza.

df['D'] = df.groupby(['A', 'B'])['C'].transform(lambda x: x.mode().iloc[0])

print(df)

   A   B   C   D
0  1  22  12  12
1  2  22   5   5
2  2  22   5   5
3  2  22   3   5
4  3  23   6   6
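
For completeness, here is a self-contained sketch that rebuilds the data shown in the output above (the constructor call is my assumption about how that data was created) and compares two approaches. As far as I can tell, the original attempt fails because agg returns one row per (A, B) group, indexed by the group keys, which cannot be aligned with the frame's own index. transform avoids this by returning a result with the original index; alternatively, the aggregated result can be merged back on the keys. The names D_merged, group_modes and merged are only illustrative.

```python
import pandas as pd

# Data assumed to match the printed output above
df = pd.DataFrame({
    "A": [1, 2, 2, 2, 3],
    "B": [22, 22, 22, 22, 23],
    "C": [12, 5, 5, 3, 6],
})

# Option 1: transform broadcasts each group's mode back to the original rows
df["D"] = df.groupby(["A", "B"])["C"].transform(lambda x: x.mode().iloc[0])

# Option 2: aggregate to one row per group, then merge back on the group keys
group_modes = (
    df.groupby(["A", "B"])["C"]
    .agg(lambda x: x.mode().iloc[0])
    .rename("D_merged")
    .reset_index()
)
merged = df.merge(group_modes, on=["A", "B"], how="left")

print(merged)
```

Both columns should hold the same values; transform is shorter, while the merge variant may be handy if you also need the per-group table on its own.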

Upvotes: 1
