Reputation: 793
I have a return
DataFrame
with daily return data and a start
DataFrame
which defines the start dates for calculating the cumulative returns over x days.
returns:
01K 02K 03K 04K
Dates
2021-01-01 1.02 1.05 1.02 1.01
2021-01-02 0.97 0.99 1.02 1.06
2021-01-03 1.03 1.07 0.98 1.02
2021-01-04 0.96 1.02 1.03 0.98
2021-01-05 1.02 1.02 1.09 1.03
2021-01-06 1.06 0.95 0.96 0.99
start:
01K 02K 03K 04K
Dates
2021-01-01 1 0 0 0
2021-01-02 0 1 0 0
2021-01-03 0 0 0 0
2021-01-04 1 0 1 0
2021-01-05 0 0 1 0
2021-01-06 0 0 0 1
I try to calculate the cumulative return (cum_returns
) from the start day until two days later and I want to assign that value to the location of the start day in the DataFrame
. For example:
At the end of the DataFrame which no return information for all three days, I try to get an NaN value instead of the cumulative return until that date. This is what I try to get:
cum_returns:
01K 02K 03K 04K
Dates
2021-01-01 1.019 0.000 0.000 0.000
2021-01-02 0.000 1.080 0.000 0.000
2021-01-03 0.000 0.000 0.000 0.000
2021-01-04 1.038 0.000 1.078 0.000
2021-01-05 0.000 0.000 NaN 0.000
2021-01-06 0.000 0.000 0.000 NaN
For reproducibility:
import pandas as pd
import numpy as np
returns = pd.DataFrame({
'Dates':['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04', '2021-01-05', '2021-01-06'],
'01K':[0.02, -0.03, 0.03, -0.04, 0.02, 0.06],
'02K':[0.05, -0.01, 0.07, 0.02, 0.02, -0.05],
'03K':[0.02, 0.02, -0.02, 0.03, 0.09, -0.04],
'04K':[0.01, 0.06, 0.02, -0.02, 0.03, -0.01]})
returns = returns.set_index('Dates')
returns = returns + 1
start = pd.DataFrame({
'Dates':['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04', '2021-01-05', '2021-01-06'],
'01K':[1, 0, 0, 1, 0, 0],
'02K':[0, 1, 0, 0, 0, 0],
'03K':[0, 0, 0, 1, 1, 0],
'04K':[0, 0, 0, 0, 0, 1]})
start = start.set_index('Dates')
Thank you very much
Upvotes: 0
Views: 219
Reputation: 1887
Try:
returns.shift(-2) * returns.shift(-1) * returns * start
This returns:
01K 02K 03K 04K
Dates
2021-01-01 1.019082 0.000000 0.000000 0.0
2021-01-02 0.000000 1.080486 0.000000 0.0
2021-01-03 0.000000 0.000000 0.000000 0.0
2021-01-04 1.037952 0.000000 1.077792 0.0
2021-01-05 NaN NaN NaN NaN
2021-01-06 NaN NaN NaN NaN
If you want less decimals, simply add round
:
(returns.shift(-2) * returns.shift(-1) * returns * start).round(3)
01K 02K 03K 04K
Dates
2021-01-01 1.019 0.000 0.000 0.0
2021-01-02 0.000 1.080 0.000 0.0
2021-01-03 0.000 0.000 0.000 0.0
2021-01-04 1.038 0.000 1.078 0.0
2021-01-05 NaN NaN NaN NaN
2021-01-06 NaN NaN NaN NaN
This is not the prettiest or most pythonic solution, but I don't have the headspace right now to present you a better one, perhaps in a bit. This should work though:
for i in range(3):
if i == 0:
df = returns.copy()
else:
df = returns.shift(-i) * df
(df * start).round(3)
You can change range(3)
to range(20)
in your case.
Upvotes: 2