Reputation: 31
I have a pandas dataframe that looks like this:
name | category | status |
---|---|---|
John | student | yes |
Jane | employee | no |
Elijah | student | no |
Anne | student | yes |
Elle | employee | no |
I want to count, for each category, the number of rows that have status 'yes'.
I have tried the two snippets below:
(DataFrame['status'].eq('yes').groupby(DataFrame['category']).nunique())
(DataFrame['status'].eq('yes').groupby(DataFrame['category']).any().sum())
Both snippets give the same output:
category
student 2
employee 1
But this is the output that I expect:
category
student 2
employee 0
Can you help me fix this?
Upvotes: 1
Views: 54
Reputation: 863801
To count the `True` values, aggregate with `sum`, because each `True` is treated as 1 and each `False` as 0:
s = (DataFrame['status'].eq('yes').groupby(DataFrame['category']).sum())
print (s)
category
employee 0
student 2
Name: status, dtype: int64
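For a version that can be pasted and run directly, here is a minimal sketch that rebuilds the sample data from the question. The names `df`, `is_yes` and `counts` are my own placeholders, not something from the original post:

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "name": ["John", "Jane", "Elijah", "Anne", "Elle"],
    "category": ["student", "employee", "student", "student", "employee"],
    "status": ["yes", "no", "no", "yes", "no"],
})

# Boolean Series: True where status is 'yes'.
is_yes = df["status"].eq("yes")

# Summing booleans per group counts the True values,
# so groups without any 'yes' still appear with a count of 0.
counts = is_yes.groupby(df["category"]).sum()
print(counts)
```

With this data, `counts` holds 2 for student and 0 for employee. Grouping the boolean Series (rather than filtering the rows first) is what keeps the employee group in the result.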
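If you aggregate with `nunique`, you get the number of distinct values in each group: the student group contains both `True` and `False`, so it returns 2, while the employee group contains only `False` (it has no 'yes'), so it returns 1.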
To verify this, check the unique values per group:
print ((DataFrame['status'].eq('yes').groupby(DataFrame['category']).unique()))
category
employee [False]
student [True, False]
Name: status, dtype: object
If you use `any`, it tests whether each group contains at least one `True`, so the output is different:
print ((DataFrame['status'].eq('yes').groupby(DataFrame['category']).any()))
category
employee False
student True
Name: status, dtype: bool
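To see the three aggregations side by side, one option is to pass them to `agg` as a list. This is only a sketch on the sample data from the question; `df`, `is_yes` and `comparison` are hypothetical names chosen for illustration:

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "name": ["John", "Jane", "Elijah", "Anne", "Elle"],
    "category": ["student", "employee", "student", "student", "employee"],
    "status": ["yes", "no", "no", "yes", "no"],
})

is_yes = df["status"].eq("yes")

# One column per aggregation:
#   sum     -> number of True values in the group
#   nunique -> number of distinct values (True/False) in the group
#   any     -> whether the group has at least one True
comparison = is_yes.groupby(df["category"]).agg(["sum", "nunique", "any"])
print(comparison)
```

Only the `sum` column matches the expected result from the question (student 2, employee 0). The `nunique` column reproduces the 2 and 1 that were seen in the first attempt.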
Upvotes: 1