MadMax
MadMax

Reputation: 183

Pandas DataFrame Group-by select column based on boolean condition

I want to group a dataframe by unique column names and count a nother column for each group but only if the condition of a thrid column is met:

This would be the three columns:

df = pd.DataFrame({'col1':['a','b','b'], 'col2':['c1', 'c2', 'c3'], 'col3':[20,10,20]})

dataframe

ToDo: count col2 for grouped by col1 if col3 == 20
I managed to do it in 3 steps:

df = df.set_index(['col1'])
df = df.loc[df['col3']==20,'col2']
df.groupby(level=0).count().idxmax()

but I want to know if it would be possible to do it with a lambda expression in one line :)

Upvotes: 1

Views: 1498

Answers (2)

Ricky
Ricky

Reputation: 2750

The apply version

df[df['col3']==20].groupby('col1').apply(lambda row: row['col2'].count())

Upvotes: 0

Alihan ÖZ
Alihan ÖZ

Reputation: 171

You can filter first and then group and count:

df[df['col3']==20].groupby('col1')['col2'].count()

Upvotes: 1

Related Questions