Reputation: 301
Just started to code using python and I would want to perform a growth calculation (Year2/Year1)-1 to find out YoY growth.
This is the original data frame I loaded from excel
Revenue Income Profit
2015 10000 5000 2000
2016 20000 6000 3000
2017 30000 7000 4000
2018 40000 8000 5000
2019 50000 9000 6000
The is the end state I would like to have in financialData_df dataframe with YoY calculation beside each original column to display the simple growth
Revenue YoY Income YoY Profit YoY
2015 10000 NaN 5000 NaN 2000 NaN
2016 20000 1.00 6000 0.20 3000 0.50
2017 30000 0.50 7000 0.17 4000 0.33
2018 40000 0.33 8000 0.14 5000 0.25
2019 50000 0.25 9000 0.13 6000 0.20
I have written a code to load the excel into a dataframe named financialData.
Using a for loop, I iterated each column to calculate the YoY return and store it into yearOnYear_df.
However, I do not want to keep a separate dataframe for all the calculations.
Can guide me how to insert a new column after using the .shift() on each column in the for loop or is there any easier method?
Please find my code below Thanks
# Load income statement
financialData_df = pd.read_excel('C:/Statement_Annual.xls', index_col = 'Name')
#Retrieve the list of columns names to use for iteration later
columnsList = list(financialData_df)
yearOnYear_df = pd.DataFrame()
# Start to calculate the YoY growth using dataframe .shift() method
for column in columnsList:
yearOnYear_df[column]= (financialData_df[column] / financialData_df[column].shift(1)) - 1
yearOnYear_df.to_excel('C:/Result.xlsx')
Upvotes: 2
Views: 396
Reputation: 25269
Edit: @ALollz reminds me of pct_change
:)). It is simple as this
df1 = df.join(df.pct_change().add_prefix('YoY_'))
df1 = df1.reindex(columns=['Revenue', 'YoY_Revenue',
'Income', 'YoY_Income',
'Profit', 'YoY_Profit'])
Out[342]:
Revenue YoY_Revenue Income YoY_Income Profit YoY_Profit
2015 10000 NaN 5000 NaN 2000 NaN
2016 20000 1.000000 6000 0.200000 3000 0.500000
2017 30000 0.500000 7000 0.166667 4000 0.333333
2018 40000 0.333333 8000 0.142857 5000 0.250000
2019 50000 0.250000 9000 0.125000 6000 0.200000
Original:
Simple calculations and join
df1 = df.join((df / df.shift() - 1).add_prefix('YoY_'))
Out[305]:
Revenue Income Profit YoY_Revenue YoY_Income YoY_Profit
2015 10000 5000 2000 NaN NaN NaN
2016 20000 6000 3000 1.000000 0.200000 0.500000
2017 30000 7000 4000 0.500000 0.166667 0.333333
2018 40000 8000 5000 0.333333 0.142857 0.250000
2019 50000 9000 6000 0.250000 0.125000 0.200000
Upvotes: 2