Reputation: 794
I have a DataFrame like this:
value id
0 ABC
1 ABC
0 DAX
0 ABC
1 YTY
So value is boolean and id is a string; there are more than 100 distinct strings.
How can I filter or group this DataFrame to get the id with the most '0' values, the id with the most '1' values, and so on?
Upvotes: 0
Views: 58
Reputation: 23217
You can use .value_counts(), as follows:
df.value_counts()
By default the result is already sorted by number of occurrences in descending order, so the first id listed under each value is the one you want:
value  id
0      ABC    2
       DAX    1
1      ABC    1
       YTY    1
dtype: int64
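For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the sample frame from the question and counts each (value, id) pair. It assumes pandas 1.1 or later, where DataFrame.value_counts() is available; the variable name counts is only illustrative.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "value": [0, 1, 0, 0, 1],
    "id": ["ABC", "ABC", "DAX", "ABC", "YTY"],
})

# Count each (value, id) pair; the result is sorted by count, largest first
counts = df.value_counts()
print(counts)
```

The exact footer of the printed Series (name and dtype line) may differ between pandas versions.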
If you want to show only the most frequent id for each value, you can also use:
df.value_counts().reset_index(level=1, name='count').groupby(level=0).first()
        id  count
value
0      ABC      2
1      ABC      1
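One thing to watch for: .first() keeps a single id per value, so ties are dropped silently. In the sample, ABC and YTY both occur once for value 1, yet only ABC is shown. Below is a rough sketch of one way to keep every tied id, assuming the same sample data; this is an alternative approach using groupby and transform, and the names counts, is_max and top are just for illustration.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "value": [0, 1, 0, 0, 1],
    "id": ["ABC", "ABC", "DAX", "ABC", "YTY"],
})

# One row per (value, id) pair with its number of occurrences
counts = df.groupby(["value", "id"]).size().reset_index(name="count")

# Keep the rows whose count equals the largest count within their value group
is_max = counts["count"] == counts.groupby("value")["count"].transform("max")
top = counts[is_max].reset_index(drop=True)
print(top)
```

With the sample data this keeps ABC for value 0 and both ABC and YTY for value 1.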
Upvotes: 1