Isha

Reputation: 55

How to calculate percentage change between two years and insert in a new DataFrame in Pandas?

I have a huge DataFrame that looks like this:

year    country     population  
1971    Afghanistan 11500000    
1972    Afghanistan 11800000    
1973    Afghanistan 12100000    
1974    Afghanistan 12400000    
1975    Afghanistan 12700000    

I want to create a new DataFrame that holds the percentage change in population for every decade, grouped by country:

country      1971-1980   1981-1990 1991-2000 2001-2010
Afghanistan  --          --        --        --
Australia    --          --        --        --

Need some help to understand how this can be done. Any help would be appreciated.

Upvotes: 2

Views: 483

Answers (1)

jezrael

Reputation: 862611

You can create a decade column, then use DataFrame.pivot_table with sum as the aggregation and chain DataFrame.pct_change:

# subtract 1 first so that years ending in 0 (e.g. 1980) stay in 1971-1980
d = (df['year'] - 1) // 10 * 10
df['dec'] = (d + 1).astype(str) + '-' + (d + 10).astype(str)
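
To see how boundary years get labelled, here is a small sketch. It assumes decades run from a year ending in 1 to a year ending in 0, as in the desired output; decade_label is a hypothetical helper name used only for this illustration.

```python
import pandas as pd

def decade_label(years: pd.Series) -> pd.Series:
    """Label each year with its decade, e.g. 1971..1980 -> '1971-1980'."""
    # Shift by one before the floor division so that years ending in 0
    # stay in the decade they close (1980 -> 1971-1980, not 1981-1990).
    start = (years - 1) // 10 * 10 + 1
    return start.astype(str) + '-' + (start + 9).astype(str)

years = pd.Series([1971, 1975, 1980, 1981, 1990, 2000, 2001])
labels = decade_label(years)
print(labels.tolist())
```

The years 1980, 1990 and 2000 are the ones worth checking, since a plain year // 10 * 10 would push them into the following decade.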

Another idea with cut:

# right=False makes each bin [start, start + 10), so e.g. 1981 starts a new decade
bins = list(range(df['year'].min(), df['year'].max() + 11, 10))
labels = [f'{i}-{j-1}' for i, j in zip(bins[:-1], bins[1:])]

df['dec'] = pd.cut(df['year'], bins=bins, labels=labels, right=False)

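# with either 'dec' column: one row per country, one column per decade,
# then the change relative to the previous decade along the columns
df1 = (df.pivot_table(index='country',
                      columns='dec',
                      values='population',
                      aggfunc='sum')
         .pct_change(axis=1))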
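
For reference, a self-contained run of the whole approach might look like the sketch below. The countries 'A' and 'B' and their population figures are invented purely for illustration, and the decade column assumes decades of the form 1971-1980.

```python
import pandas as pd

# Synthetic data, made up for this example (not real population figures):
# country 'A' grows by 10 per year, country 'B' stays constant.
rows = []
for country, base, step in [('A', 100, 10), ('B', 200, 0)]:
    for i, year in enumerate(range(1971, 1991)):
        rows.append({'year': year,
                     'country': country,
                     'population': base + step * i})
df = pd.DataFrame(rows)

# Decade label such as '1971-1980' (years ending in 0 close a decade).
start = (df['year'] - 1) // 10 * 10 + 1
df['dec'] = start.astype(str) + '-' + (start + 9).astype(str)

# Sum the yearly populations per country and decade, then take the
# relative change from one decade column to the next.
df1 = (df.pivot_table(index='country',
                      columns='dec',
                      values='population',
                      aggfunc='sum')
         .pct_change(axis=1))
print(df1)
```

The first decade column is NaN because there is no earlier decade to compare against. pct_change returns fractions, so multiply by 100 (for example df1 * 100) if you want percentages.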

Upvotes: 2
