Soma Anchal

Reputation: 45

Groupby() in pandas in Python

I have a dataset with the following columns:

Country, Year, Population, Suicide case, Country GDP

Problem: I want to calculate (Suicide case / Population) * 100 for each country.

My approach:

import pandas as pd
fileName = pd.read_csv("File Path")
pd.groupby("Country")

How should I extend my code for the calculation above?

Upvotes: 0

Views: 95

Answers (3)

Loochie

Reputation: 2472

A more concise alternative is:

df.groupby('Country').apply(lambda x: x['Suicide case'].sum()/
                               float(x['Population'].sum())*100)
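To see what this returns, here is a minimal sketch on made-up data (the values and the Year column are assumptions; the column names follow the question). It selects the two numeric columns before calling apply, which is a small variation on the one-liner above so that only those columns reach the lambda:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    "Country": ["France", "UK", "France", "UK"],
    "Year": [2000, 2000, 2001, 2001],
    "Population": [100, 200, 300, 400],
    "Suicide case": [5, 3, 6, 2],
})

# For each country: total cases divided by total population, as a percentage.
rate = (
    df.groupby("Country")[["Suicide case", "Population"]]
      .apply(lambda g: g["Suicide case"].sum() / g["Population"].sum() * 100)
)

print(rate)
```

The result is a Series indexed by Country, so a single value can be read with rate["France"].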

Upvotes: 2

Ghanshyam Savaliya

Reputation: 608

If I understood your question correctly, you can try the code below to get the result you want:

fileName = fileName.groupby(['Year','Country']).sum()
fileName['New_var'] = (fileName['Suicide case']/ fileName['Population'])*100

You also need to include Year in the grouping; otherwise the values are aggregated across all years as well.
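The difference between the two groupings can be sketched on a small made-up frame (the numbers and the rate_pct column name are assumptions, not from the question). Grouping by Year and Country gives one rate per country per year, while grouping by Country alone sums cases and population over all years before dividing:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
data = pd.DataFrame({
    "Country": ["France", "France", "UK", "UK"],
    "Year": [2000, 2001, 2000, 2001],
    "Population": [100, 200, 400, 500],
    "Suicide case": [5, 4, 2, 10],
})

cols = ["Suicide case", "Population"]

# One row per (Year, Country): the rate is computed within each year.
per_year = data.groupby(["Year", "Country"])[cols].sum()
per_year["rate_pct"] = per_year["Suicide case"] / per_year["Population"] * 100

# One row per Country: cases and population are summed across all years first.
overall = data.groupby("Country")[cols].sum()
overall["rate_pct"] = overall["Suicide case"] / overall["Population"] * 100

print(per_year)
print(overall)
```

Which of the two is appropriate depends on whether a yearly rate or a single pooled figure per country is wanted.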

Upvotes: 1

Angelo

Reputation: 655

Here is an example. It could probably be improved, but it should work for you.

import pandas as pd
df = pd.DataFrame({"Country":["France", "UK", "France", "UK"], 
                   "Population":[1, 2, 3, 4],
                   "Suicide case":[5, 3, 6, 2]})
df_grouped = df.groupby("Country").sum()
(df_grouped["Suicide case"]/df_grouped["Population"])*100

Upvotes: 2
