Reputation: 65
I have the following dataframe df1:
Bank Rate_1Y%
Bank of America 2
Barclays 0.75
Nationalbanken 0.05
Deutsche Bank 0
UBS -0.75
I have the following dataframe df2:
0
2010-12-31 2010-12-31
2011-12-31 2011-12-31
2012-12-31 2012-12-31
2013-12-31 2013-12-31
2014-12-31 2014-12-31
2015-12-31 2015-12-31
2016-12-31 2016-12-31
2017-12-31 2017-12-31
2018-12-31 2018-12-31
2019-12-31 2019-12-31
I have the following input values:
Input_Balance = 10000
Start_Date = '2010-01-01'
End_Date = '2020-01-01'
freq = '1Y'
I created a new df2 with a time column:
DatetimeIndex(['2010-12-31', '2011-12-31', '2012-12-31', '2013-12-31',
'2014-12-31', '2015-12-31', '2016-12-31', '2017-12-31',
'2018-12-31', '2019-12-31'],
dtype='datetime64[ns]
Can anyone help me find a simple function for calculating how Input_Balance changes over the period from Start_Date to End_Date?
I want a new column in df2 that holds the end balance for a chosen bank; in this case I use Bank of America.
Expected Output:
Date End Balance
2010-12-31 10200$
2011-12-31 10200$
2012-12-31 10200$
I need a column with the end balance for the chosen bank over the selected period (start date to end date).
Upvotes: 1
Views: 138
Reputation: 23099
IIUC, you need to recursively add the interest to the current value?
I assume that df contains the banks and their interest rates (df1 in the question), and that df2 contains the dates.
We can then take a cartesian product to build a new dataframe and loop over its rows.
# Starting balance (Input_Balance in the question).
value = 10000
# Cartesian product: one row per (bank, date) pair.
df3 = (
    df.assign(key=1)
    .merge(df2.assign(key=1), on="key")
    .drop("key", axis=1)
)
# Get indices of the first row of each bank. Assumes the data is ordered by datetime.
indices = df3.drop_duplicates(subset='Bank', keep='first').index.tolist()
# Calculate the first interest value for each bank.
df3.loc[indices, 'Value'] = value + (value * (df3.loc[indices, 'Rate_1Y%'] / 100))
# Calculate the rest of the data frame, skipping each bank's first row
# so that one bank's balance does not carry over into the next.
for i in range(1, len(df3)):
    if i in indices:
        continue
    df3.loc[i, 'Value'] = df3.loc[i-1, 'Value'] + (df3.loc[i-1, 'Value'] * (df3.loc[i, 'Rate_1Y%'] / 100))
print(df3)
Bank Rate_1Y% Date Value
0 Bank of America 2.00 2010-12-31 10200.000000
1 Bank of America 2.00 2011-12-31 10404.000000
2 Bank of America 2.00 2012-12-31 10612.080000
3 Bank of America 2.00 2013-12-31 10824.321600
4 Bank of America 2.00 2014-12-31 11040.808032
5 Bank of America 2.00 2015-12-31 11261.624193
6 Bank of America 2.00 2016-12-31 11486.856676
7 Bank of America 2.00 2017-12-31 11716.593810
8 Bank of America 2.00 2018-12-31 11950.925686
9 Bank of America 2.00 2019-12-31 12189.944200
As a function; feel free to edit it to suit your needs.
def calc_interest(dataframe_1, dataframe_2, value, col_name='Rate_1Y%'):
    # value is the starting balance, e.g. 10000.
    df3 = (
        dataframe_1.assign(key=1)
        .merge(dataframe_2.assign(key=1), on="key")
        .drop("key", axis=1)
    )
    indices = df3.drop_duplicates(subset='Bank', keep='first').index.tolist()
    df3.loc[indices, 'Value'] = value + (value * (df3.loc[indices, col_name] / 100))
    for i in range(1, len(df3)):
        if i in indices:
            continue  # first row of a bank, already set above
        df3.loc[i, 'Value'] = df3.loc[i-1, 'Value'] + (df3.loc[i-1, 'Value'] * (df3.loc[i, col_name] / 100))
    return df3
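The row-by-row loop can also be written without explicit iteration. The following is only a sketch under a few assumptions: pandas 1.2 or newer (for merge with how="cross"), a date column named Date, and small stand-in frames called banks and dates that are made up here for illustration. The helper name compound_by_bank is hypothetical too.

```python
import pandas as pd

# Hypothetical stand-ins for the question's df1 and df2.
banks = pd.DataFrame({"Bank": ["Bank of America", "UBS"], "Rate_1Y%": [2.0, -0.75]})
dates = pd.DataFrame({"Date": pd.to_datetime(["2010-12-31", "2011-12-31", "2012-12-31"])})

def compound_by_bank(banks, dates, start_value, col_name="Rate_1Y%"):
    # One row per (bank, date) pair; how="cross" needs pandas 1.2 or newer.
    out = banks.merge(dates, how="cross")
    # Yearly growth factor, compounded cumulatively within each bank.
    growth = 1 + out[col_name] / 100
    out["Value"] = start_value * growth.groupby(out["Bank"]).cumprod()
    return out

result = compound_by_bank(banks, dates, 10000)
print(result)
```

Grouping by Bank before taking the cumulative product keeps each bank's balance separate, so no special handling of the first row per bank is needed.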
Upvotes: 1
Reputation: 91
If I understood the OP's problem correctly, and each row of df2 should correspond to the current balance at time t given an initial balance at Start_Date, then I would go this way:
from datetime import datetime, timedelta
import pandas as pd

def compute_balance(input_balance,
                    prev_date,
                    end_date,
                    time_interval,
                    rate_by_bank,
                    data=None,
                    ):
    """
    Recursively compute balance at time t given yearly rate
    :param input_balance: initial input balance (x0)
    :param prev_date: datetime.datetime object specifying starting date
    :param end_date: datetime.datetime object specifying ending date
    :param time_interval: time interval in days
    :param rate_by_bank: an iterable of (bank_name, rate) pairs giving the yearly rate in percent per bank
    :param data: List of dictionary (must not be set by user)
    :return pandas.DataFrame
    """
    if data is None:
        # first call: every bank starts from the input balance
        data = [{
            'time': prev_date,
            **{
                bank_name: input_balance
                for bank_name, _ in rate_by_bank
            }
        }]
    nb_days_per_year = 365.0
    normalized_time_interval = time_interval/nb_days_per_year
    cur_date = prev_date + timedelta(days=time_interval)
    if cur_date >= end_date:
        return pd.DataFrame(data).set_index('time')
    # add the interest earned over one interval to the previous balance
    balance_per_bank = {
        bank_name: (data[-1][bank_name]
                    + (rate/100.0) * normalized_time_interval * data[-1][bank_name]
                    )
        for bank_name, rate in rate_by_bank
    }
    data.append({
        'time': cur_date,
        **balance_per_bank
    })
    return compute_balance(input_balance, cur_date, end_date, time_interval, rate_by_bank, data)
# Input variables
Input_Balance = 10000
Start_Date = '2010-01-01'
End_Date = '2020-01-01'
# convert df_1 (df1 in the question) to a list of [bank, rate] pairs
rates = df_1.to_dict(orient='split')['data']
# convert dates to datetime objects (ISO year-month-day)
start_date = pd.Timestamp(datetime.strptime(Start_Date, '%Y-%m-%d'))
end_date = pd.Timestamp(datetime.strptime(End_Date, '%Y-%m-%d'))
df_2 = compute_balance(Input_Balance, start_date, end_date, 365, rates)
Which should then output:
Bank of America Barclays Deutsche Bank NationalBanken \
time
2010-01-01 10000.0000 10000.000000 10000.0 10000.000000
2011-01-01 10200.0000 10075.000000 10000.0 10005.000000
2012-01-01 10404.0000 10150.562500 10000.0 10010.002500
2012-12-31 10612.0800 10226.691719 10000.0 10015.007501
2013-12-31 10824.3216 10303.391907 10000.0 10020.015005
UBS
time
2010-01-01 10000.000000
2011-01-01 9925.000000
2012-01-01 9850.562500
2012-12-31 9776.683281
2013-12-31 9703.358157
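With a 365-day interval the recursion adds exactly one year of interest per step, so the values in the table should agree with the closed-form compound-interest formula B_n = B_0 * (1 + r/100)^n. A small check, where end_balance is a hypothetical helper name introduced only for this illustration:

```python
def end_balance(start_balance, rate_percent, years):
    # Closed-form annual compounding: B_n = B_0 * (1 + r/100) ** n
    return start_balance * (1 + rate_percent / 100) ** years

# Compare with the fifth row of the table above (four compounding steps).
boa_after_4 = end_balance(10000, 2, 4)
ubs_after_4 = end_balance(10000, -0.75, 4)
print(round(boa_after_4, 4), round(ubs_after_4, 6))
```

For intervals other than 365 days the two approaches differ, because the recursion applies simple interest within each interval.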
Upvotes: 4
Reputation: 568
If you need to create a new column in df2, just write:
import pandas as pd
# move the date index into a column named Start_Date
df2 = df2.rename_axis('Start_Date').reset_index()
df2['End_Date'] = '2020-01-01'  # or any required value
df2['Start_Date'] = pd.to_datetime(df2['Start_Date'])
df2['End_Date'] = pd.to_datetime(df2['End_Date'])
# length of each period (a timedelta), stored under the column name Input_Balance
df2['Input_Balance'] = df2['End_Date'] - df2['Start_Date']
If you need a new column for a specific bank, the bank name has to be in df2 as well. Another way is to use groupby with aggregation.
It would help to see examples of df1 and df2, and a clear expected result for df2.
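For the single-bank case, one possible reading of the question is a helper that looks up the rate in df1 and writes the balances straight into df2. This is a sketch, not a definitive answer: the frames below are a hypothetical reconstruction of df1 and df2, add_end_balance is an invented name, and it assumes one row per year with yearly compounding (the question's expected output repeats 10200, so the intended rule may differ).

```python
import numpy as np
import pandas as pd

# Hypothetical reconstruction of the question's frames.
df1 = pd.DataFrame({"Bank": ["Bank of America", "Barclays"], "Rate_1Y%": [2.0, 0.75]})
idx = pd.to_datetime(["2010-12-31", "2011-12-31", "2012-12-31"])
df2 = pd.DataFrame(index=idx)

def add_end_balance(df2, df1, bank, input_balance):
    # Look up the yearly rate (in percent) of the selected bank.
    rate = df1.loc[df1["Bank"] == bank, "Rate_1Y%"].iloc[0]
    out = df2.copy()
    # Row k (0-based) has had k + 1 full years of compounding.
    periods = np.arange(1, len(out) + 1)
    out[f"End Balance ({bank})"] = input_balance * (1 + rate / 100) ** periods
    return out

df2 = add_end_balance(df2, df1, "Bank of America", 10000)
print(df2)
```

Calling the helper again with another bank name adds a second column, which keeps df2 as the single table of dates the question asks for.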
Upvotes: 0