vitalstrike82
vitalstrike82

Reputation: 301

How to insert a new column beside each dataframe columns

Just started to code using python and I would want to perform a growth calculation (Year2/Year1)-1 to find out YoY growth.

This is the original data frame I loaded from excel

        Revenue Income  Profit
2015    10000   5000    2000
2016    20000   6000    3000
2017    30000   7000    4000
2018    40000   8000    5000
2019    50000   9000    6000

The is the end state I would like to have in financialData_df dataframe with YoY calculation beside each original column to display the simple growth

       Revenue  YoY    Income   YoY   Profit    YoY
2015    10000   NaN     5000    NaN     2000    NaN
2016    20000   1.00    6000    0.20    3000    0.50
2017    30000   0.50    7000    0.17    4000    0.33
2018    40000   0.33    8000    0.14    5000    0.25
2019    50000   0.25    9000    0.13    6000    0.20

I have written a code to load the excel into a dataframe named financialData.

Using a for loop, I iterated each column to calculate the YoY return and store it into yearOnYear_df.

However, I do not want to keep a separate dataframe for all the calculations.

Can guide me how to insert a new column after using the .shift() on each column in the for loop or is there any easier method?

Please find my code below Thanks

# Load income statement
financialData_df = pd.read_excel('C:/Statement_Annual.xls', index_col = 'Name')

#Retrieve the list of columns names to use for iteration later
columnsList = list(financialData_df)


yearOnYear_df = pd.DataFrame()

# Start to calculate the YoY growth using dataframe .shift() method
for column in columnsList:
    yearOnYear_df[column]= (financialData_df[column] / financialData_df[column].shift(1)) - 1

yearOnYear_df.to_excel('C:/Result.xlsx')

Upvotes: 2

Views: 396

Answers (1)

Andy L.
Andy L.

Reputation: 25269

Edit: @ALollz reminds me of pct_change :)). It is simple as this

df1 = df.join(df.pct_change().add_prefix('YoY_'))
df1 = df1.reindex(columns=['Revenue', 'YoY_Revenue', 
                           'Income', 'YoY_Income', 
                           'Profit', 'YoY_Profit'])

Out[342]:
      Revenue  YoY_Revenue  Income  YoY_Income  Profit  YoY_Profit
2015    10000          NaN    5000         NaN    2000         NaN
2016    20000     1.000000    6000    0.200000    3000    0.500000
2017    30000     0.500000    7000    0.166667    4000    0.333333
2018    40000     0.333333    8000    0.142857    5000    0.250000
2019    50000     0.250000    9000    0.125000    6000    0.200000

Original:
Simple calculations and join

df1 = df.join((df / df.shift() - 1).add_prefix('YoY_'))

Out[305]:
      Revenue  Income  Profit  YoY_Revenue  YoY_Income  YoY_Profit
2015    10000    5000    2000          NaN         NaN         NaN
2016    20000    6000    3000     1.000000    0.200000    0.500000
2017    30000    7000    4000     0.500000    0.166667    0.333333
2018    40000    8000    5000     0.333333    0.142857    0.250000
2019    50000    9000    6000     0.250000    0.125000    0.200000

Upvotes: 2

Related Questions