ldp

Reputation: 55

Pandas groupby method to aggregate based on string contained in column

New to Pandas/Python (student). I have what should be a simple problem but every approach I try fails.

The dataset has a "country" column and an "indicator" column. Each country appears more than once. The indicator column tells us who is pro-vaccine ("Vac_plan" and "Vac_done") and who is not (as well as other info). I simply want a total for each country: the number of rows for that country whose indicator is pro-vaccine, e.g.:

Ethiopia  7
Nigeria   5

My latest failed attempts are below:

vaccines_by_country=df.groupby('country')['indicator'=='Vac_plan|Vac_done'].count()

and...

df.groupby(['country']).str.contains('Vac_plan|Vac_done').count() 

TIA for your merciful help.

Upvotes: 1

Views: 2220

Answers (1)

user17242583

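You're quite close in your second attempt; you just need to reverse the order of operations. First find the matching strings, which gives a boolean Series, then group that Series by country and sum it (each True counts as 1):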

df['indicator'].str.contains('Vac_plan|Vac_done').groupby(df['country']).sum()
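To see this end to end, here is a minimal sketch on a small made-up DataFrame. The rows and the extra labels "No_vac" and "Other" are invented for illustration; only the column names and the two pro-vaccine labels come from the question. The `na=False` argument is an assumption that missing indicators should count as non-matches.

```python
import pandas as pd

# Hypothetical sample data; the rows are invented for illustration only.
df = pd.DataFrame({
    "country": ["Ethiopia", "Ethiopia", "Ethiopia", "Nigeria", "Nigeria", "Kenya"],
    "indicator": ["Vac_plan", "Vac_done", "No_vac", "Vac_done", "Other", "No_vac"],
})

# Boolean Series: True where the indicator contains either pro-vaccine label.
# na=False treats missing indicators as non-matches (an assumption).
is_pro_vaccine = df["indicator"].str.contains("Vac_plan|Vac_done", na=False)

# Summing booleans counts the True values within each country.
vaccines_by_country = is_pro_vaccine.groupby(df["country"]).sum()

print(vaccines_by_country)
```

Note the use of `.sum()` rather than `.count()`: `count()` would count every non-null row in each group, including the False ones, so it would return the number of rows per country instead of the number of pro-vaccine rows.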

Upvotes: 2
