How to group by and count values in a pandas DataFrame?

Question

This is my DataFrame:

import pandas as pd

df = pd.DataFrame([
    ('a', 0, 0),
    ('b', 1, 1),
    ('c', 1, 0),
    ('d', 2, 1),
    ('e', 2, 1)
], columns=['name', 'cluster', 'is_selected'])

I want to count the selected letters in each cluster, grouped by cluster. I tried this: df.groupby('cluster')['is_selected'].value_counts() and I get this output:

cluster  is_selected
0        0              1
1        0              1
         1              1
2        1              2
Name: is_selected, dtype: int64

But what I want is this format:

cluster  count_selected
0        1        
1        1             
2        2

How can I fix it?

Gorlomi · Accepted Answer

Based on your explanation, you want to count the letters that are selected (is_selected equal to 1), grouped by cluster.

If that's what you're looking for, then this should help:

df[df.is_selected == 1].groupby(['cluster'])['name'].count().reset_index(name='count_selected')

The output is a little different from your expected output: I'm not sure why cluster 0 would have a count of 1 there, since its only row has is_selected equal to 0. I hope this is what you need!

Output:

   cluster  count_selected
0        1               1
1        2               2
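
If clusters with no selected rows should also appear (with a count of 0), one option is to sum the flag per cluster instead of filtering first. This is only a sketch, and it assumes is_selected only ever holds 0 or 1, so that the sum equals the number of selected rows:

```python
import pandas as pd

df = pd.DataFrame([
    ('a', 0, 0),
    ('b', 1, 1),
    ('c', 1, 0),
    ('d', 2, 1),
    ('e', 2, 1),
], columns=['name', 'cluster', 'is_selected'])

# Assumption: is_selected is a 0/1 flag, so summing it counts the selected rows.
# Unlike filtering first, this keeps clusters where nothing is selected.
result = (
    df.groupby('cluster')['is_selected']
      .sum()
      .reset_index(name='count_selected')
)
print(result)
```

With the sample data this gives counts of 0, 1 and 2 for clusters 0, 1 and 2, so it still does not reproduce the count of 1 shown for cluster 0 in the question.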
