gabboshow

Reputation: 5569

Use conditions after groupby

I have a pandas DataFrame df:

import pandas

df = pandas.DataFrame(
    data=[["A", "Man"], ["A", "Woman"], ["A", "Man"], ["A", "Man"], ["B", "Woman"]],
    columns=["category", "gender"],
)

df
  category gender
0        A    Man
1        A  Woman
2        A    Man
3        A    Man
4        B  Woman

I then count how many men and women are in each category:

grouped = df.groupby(by=["category", "gender"])["gender"].count()

grouped
category  gender
A         Man       3
          Woman     1
B         Woman     1
Name: gender, dtype: int64

How can I get a list of the categories in which both men and women appear at least once? For the data above I expect:

category_list = ["A"]

Upvotes: 1

Views: 43

Answers (2)

Quang Hoang

Reputation: 150755

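If I understand correctly, you can count the genders per category, pivot them into columns, and keep the rows where every count is at least 1: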

s = df.groupby('category')['gender'].value_counts().unstack(fill_value=0)

s[s.ge(1).all(axis=1)]

gives you

gender    Man  Woman
category            
A           3      1
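
Since the question asks for a list rather than a table, here is a minimal sketch of how the index of that filtered frame could be turned into one. The names `counts` and `category_list` are only illustrative, and the threshold of 1 is an assumption taken from the expected result in the question.

```python
import pandas as pd

df = pd.DataFrame(
    data=[["A", "Man"], ["A", "Woman"], ["A", "Man"], ["A", "Man"], ["B", "Woman"]],
    columns=["category", "gender"],
)

# One row per category, one column per gender; missing pairs become 0.
counts = df.groupby("category")["gender"].value_counts().unstack(fill_value=0)

# Keep categories where every gender column has a count of at least 1,
# then pull the surviving category labels out of the index.
category_list = counts[counts.ge(1).all(axis=1)].index.tolist()
print(category_list)  # ['A']
```

Note that only genders present somewhere in the data become columns, so a gender that never occurs at all is not checked.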

Upvotes: 2

Yilun Zhang

Reputation: 9018

You can convert the result to a DataFrame and then filter it with query:

pandas.DataFrame(grouped).query("gender > 1")

                 gender
category gender        
A        Man          3

Or you can directly do:

grouped[grouped > 1]
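
Both filters above keep individual (category, gender) pairs, so they do not by themselves say whether every gender in a category passed. A possible sketch for getting from the filtered Series to a list of categories is below. `min_count`, `n_genders`, and `category_list` are hypothetical names, and `min_count = 1` is an assumption based on the expected result `["A"]` in the question.

```python
import pandas as pd

df = pd.DataFrame(
    data=[["A", "Man"], ["A", "Woman"], ["A", "Man"], ["A", "Man"], ["B", "Woman"]],
    columns=["category", "gender"],
)
grouped = df.groupby(by=["category", "gender"])["gender"].count()

min_count = 1  # assumed threshold; raise it to require more people per gender
n_genders = df["gender"].nunique()

# Keep the (category, gender) pairs that meet the threshold ...
passing = grouped[grouped >= min_count]

# ... then count how many genders survived in each category.
genders_per_category = passing.groupby(level="category").size()

# A category qualifies only if all genders are still present.
category_list = genders_per_category[genders_per_category == n_genders].index.tolist()
print(category_list)  # ['A']
```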

Upvotes: 1
