merchmallow

Reputation: 794

Counting occurrences in a dataframe

I have a DataFrame like this:

value     id
 0        ABC
 1        ABC
 0        DAX
 0        ABC
 1        YTY

Here, value is boolean and id is a string; there are more than 100 distinct ids.

How can I filter or group this df to get the id with the most 0 values, the id with the most 1 values, and so on?

Upvotes: 0

Views: 58

Answers (2)

SeaBean

Reputation: 23217

You can use DataFrame.value_counts() (available since pandas 1.1), as follows:

df.value_counts()

By default, the result is sorted by count in descending order, so the first id listed under each value is the one with the most occurrences:

value  id 
0      ABC    2
       DAX    1
1      ABC    1
       YTY    1
dtype: int64

If you want to show only the most frequent id for each value, you can also use:

df.value_counts().reset_index(level=1, name='count').groupby(level=0).first()

        id  count
value            
0      ABC      2
1      ABC      1
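
For anyone who wants to try this end to end, here is a minimal sketch on the sample data from the question. The DataFrame construction is an assumption (rebuilt from the table in the question, with value stored as 0/1 integers), and the names counts and top are only illustrative. Note that when two ids tie for the top count, as ABC and YTY do for value 1, which one comes first may depend on the pandas version.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: value is stored as 0/1)
df = pd.DataFrame({
    "value": [0, 1, 0, 0, 1],
    "id": ["ABC", "ABC", "DAX", "ABC", "YTY"],
})

# Count every (value, id) pair; sorted by count, largest first
counts = df.value_counts()

# Move id out of the index, then keep the first (highest-count) row per value
top = counts.reset_index(level=1, name="count").groupby(level=0).first()

print(top)
```

Here top is indexed by value and has the columns id and count, so top.loc[0, "id"] gives the most frequent id for value 0.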

Upvotes: 1

BENY

Reputation: 323226

Try pd.crosstab followed by idxmax, which returns the id with the highest count for each value:

pd.crosstab(df.id,df.value).idxmax()
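
A short sketch of how this plays out on the question's sample data, in case it helps. The DataFrame construction is an assumption (rebuilt from the table in the question, with value as 0/1 integers), and table, top_ids and top_counts are illustrative names only.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: value is stored as 0/1)
df = pd.DataFrame({
    "value": [0, 1, 0, 0, 1],
    "id": ["ABC", "ABC", "DAX", "ABC", "YTY"],
})

# Rows are ids, columns are values, cells are occurrence counts
table = pd.crosstab(df["id"], df["value"])

# For each value column, the id (row label) holding the largest count
top_ids = table.idxmax()

# The corresponding largest counts, if they are needed as well
top_counts = table.max()

print(table)
print(top_ids)
```

If several ids share the maximum for a value, idxmax reports only one of them, so ties need separate handling if they matter.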

Upvotes: 0
