Reputation: 380
I have a DataFrame with several ids, where each row assigns a category to an id. For each id, the result should contain the category that occurs most often.
Example:
id categorie
1 aaa
1 aaa
2 bbb
2 bbb
2 aaa
3 aaa
3 ccc
3 ccc
Result:
id categorie
1 aaa
2 bbb
3 ccc
I tried several .groupby() approaches but none have worked so far.
Upvotes: 0
Views: 49
Reputation: 7604
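Group by id and keep the most frequent category in each group. Note that .max() is not a substitute here: on strings it returns the alphabetically largest category per id, which matches the example data only by coincidence. value_counts() sorts by count in descending order, so its first index entry is the most common value:
df = df.groupby(by=['id'], as_index=False).agg(lambda x: x.value_counts().index[0])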
print(df)
   id categorie
0   1       aaa
1   2       bbb
2   3       ccc
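As a self-contained check, here is a minimal sketch that rebuilds the question's data and compares a mode-based aggregation with .max(). The extra rows for id 4 and the helper name most_frequent are assumptions added for illustration; they are not part of the original question. Series.mode() is used instead of value_counts() only because it returns tied values in sorted order, which makes tie-breaking predictable.

```python
import pandas as pd

# Data from the question, plus a hypothetical id 4 whose most frequent
# category ("aaa") is not the alphabetically largest one ("zzz").
df = pd.DataFrame({
    "id": [1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4],
    "categorie": ["aaa", "aaa", "bbb", "bbb", "aaa",
                  "aaa", "ccc", "ccc", "aaa", "aaa", "zzz"],
})


def most_frequent(values: pd.Series) -> str:
    # mode() returns every most-frequent value, sorted; taking the first
    # one resolves ties to the alphabetically smallest category.
    return values.mode().iloc[0]


# Most frequent category per id.
by_mode = df.groupby("id")["categorie"].agg(most_frequent).reset_index()

# Alphabetically largest category per id, shown only for comparison.
by_max = df.groupby("id")["categorie"].max().reset_index()

print(by_mode)
print(by_max)
```

For ids 1 to 3 both results agree, but for the hypothetical id 4 the mode-based version yields aaa while .max() yields zzz.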
Upvotes: 1