Reputation: 183
I want to group a dataframe by unique column names and count a nother column for each group but only if the condition of a thrid column is met:
This would be the three columns:
df = pd.DataFrame({'col1':['a','b','b'], 'col2':['c1', 'c2', 'c3'], 'col3':[20,10,20]})
ToDo: count col2 for grouped by col1 if col3 == 20
I managed to do it in 3 steps:
df = df.set_index(['col1'])
df = df.loc[df['col3']==20,'col2']
df.groupby(level=0).count().idxmax()
but I want to know if it would be possible to do it with a lambda expression in one line :)
Upvotes: 1
Views: 1498
Reputation: 2750
The apply version
df[df['col3']==20].groupby('col1').apply(lambda row: row['col2'].count())
Upvotes: 0
Reputation: 171
You can filter first and then group and count:
df[df['col3']==20].groupby('col1')['col2'].count()
Upvotes: 1