Amir Katorza

Reputation: 35

Calculate diff of values in a column under condition

I have a DataFrame with the columns ['Region', 'Country', 'Year', 'Yearly_rank'].

I need to group by region and country and, for each country, calculate the change in 'Yearly_rank' between 2016 and 2019.

import pandas as pd

df = pd.DataFrame([
    {'Region': 'west europe', 'Country': 'Finland', 'Year': 2019, 'Yearly_rank': 1},
    {'Region': 'west europe', 'Country': 'Denmark', 'Year': 2019, 'Yearly_rank': 2},
    {'Region': 'west europe', 'Country': 'Norway', 'Year': 2019, 'Yearly_rank': 3},
    {'Region': 'west europe', 'Country': 'Iceland', 'Year': 2019, 'Yearly_rank': 4},
    {'Region': 'west europe', 'Country': 'Netherlands', 'Year': 2019, 'Yearly_rank': 5},
    {'Region': 'west europe', 'Country': 'Switzerland', 'Year': 2019, 'Yearly_rank': 6},
    {'Region': 'west europe', 'Country': 'Sweden', 'Year': 2019, 'Yearly_rank': 7},
    {'Region': 'australia and new zealand', 'Country': 'New Zealand', 'Year': 2019, 'Yearly_rank': 8},
    {'Region': 'north america', 'Country': 'Canada', 'Year': 2019, 'Yearly_rank': 9},
    {'Region': 'west europe', 'Country': 'Austria', 'Year': 2019, 'Yearly_rank': 10},
])

Upvotes: 0

Views: 21

Answers (1)

Code Different

Reputation: 93161

Try this:

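# Keep only the two years being compared.
cond = df["Year"].isin([2016, 2019])
# Within each (Region, Country), diff() gives the 2019 rank minus the 2016 rank; the 2016 row gets NaN.
change = df[cond].sort_values("Year").groupby(["Region", "Country"])["Yearly_rank"].diff()
# Per region, keep the row with the smallest (most negative) change.
df.assign(change=change).sort_values("change").groupby("Region").head(1)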
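
If you only want one change value per country, a pivot may be easier to read than diff(). This is just a sketch of that alternative, not the approach above: the sample in the question only contains 2019 rows, so the 2016 ranks here are made up for illustration, and the names wide and rank_change are my own.

```python
import pandas as pd

# Hypothetical data: the 2016 ranks below are invented for illustration only.
df = pd.DataFrame({
    "Region": ["west europe"] * 4 + ["north america"] * 2,
    "Country": ["Finland", "Finland", "Denmark", "Denmark", "Canada", "Canada"],
    "Year": [2016, 2019, 2016, 2019, 2016, 2019],
    "Yearly_rank": [5, 1, 1, 2, 6, 9],
})

# One row per (Region, Country), one column per year.
wide = (
    df[df["Year"].isin([2016, 2019])]
    .pivot_table(index=["Region", "Country"], columns="Year", values="Yearly_rank")
)

# Change in rank from 2016 to 2019 (negative means the country moved up the ranking).
rank_change = (wide[2019] - wide[2016]).rename("change").reset_index()
print(rank_change)
```

A country that is missing either year ends up with NaN in the change column, so you can drop those with dropna() if you don't need them.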

Upvotes: 2
