jonas

Reputation: 380

Get the most frequent value for each id in a DataFrame

I have a DataFrame with several ids, and each id appears with one or more categories. For each id, the result should contain the category that occurs most often.

Example:

id  categorie
1   aaa
1   aaa
2   bbb
2   bbb
2   aaa
3   aaa
3   ccc
3   ccc

Result:

id  categorie
1   aaa
2   bbb
3   ccc

I tried several .groupby() approaches, but none has worked so far.

Upvotes: 0

Views: 49

Answers (1)

NYC Coder

Reputation: 7604

You can take the most frequent value (the mode) of each group:

df = df.groupby('id')['categorie'].agg(lambda s: s.mode().iloc[0]).reset_index()

Or:

df = df.groupby(by=['id'], as_index=False).agg(lambda x:x.value_counts().index[0])
print(df)

   id categorie
0   1       aaa
1   2       bbb
2   3       ccc
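
For reference, here is a self-contained sketch that rebuilds the sample data from the question and picks the most frequent category per id. The helper name `most_frequent` is only illustrative, and the tie handling is an assumption about what you want: `Series.mode()` returns every value tied for the highest count in sorted order, so `.iloc[0]` keeps the alphabetically first one.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({
    "id": [1, 1, 2, 2, 2, 3, 3, 3],
    "categorie": ["aaa", "aaa", "bbb", "bbb", "aaa", "aaa", "ccc", "ccc"],
})


def most_frequent(s: pd.Series):
    # Hypothetical helper: Series.mode() returns all values tied for the
    # highest count, sorted; .iloc[0] picks the first of them.
    return s.mode().iloc[0]


# Group by id, reduce each group's categories to a single value,
# then turn the id index back into a regular column.
result = df.groupby("id")["categorie"].agg(most_frequent).reset_index()
print(result)
```

If ties should be resolved differently (for example, by first appearance), the helper is the only part that needs to change.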

Upvotes: 1
