Reputation: 477
I have the following script:
import pandas as pd

df = pd.DataFrame()
df["Stake"] = [0.25, 0.15, 0.26, 0.30, 0.10, 0.40, 0.32, 0.11, 0.20, 0.25]
df["Odds"] = [2.5, 4.0, 1.75, 2.2, 1.85, 3.2, 1.5, 1.2, 2.15, 1.65]
df["Ftr"] = ["H", "D", "A", "H", "H", "A", "D", "H", "H", "A"]
df["Ind"] = [1, 2, 2, 1, 3, 3, 3, 1, 2, 2]
which results in:
   Stake  Odds Ftr  Ind
0   0.25  2.50   H    1
1   0.15  4.00   D    2
2   0.26  1.75   A    2
3   0.30  2.20   H    1
4   0.10  1.85   H    3
5   0.40  3.20   A    3
6   0.32  1.50   D    3
7   0.11  1.20   H    1
8   0.20  2.15   H    2
9   0.25  1.65   A    2
I want to create two additional columns, "Start Balance" and "End Balance". "Start Balance" at index 0 is equal to 1000. "End Balance" is always equal to either:
"Start Balance" - "Stake" * "Start Balance" + "Stake" * "Start Balance" * "Odds" if column "Ftr" is "H",
or
"Start Balance" - "Stake" * "Start Balance" if column "Ftr" is anything other than "H".
The "Start Balance" of the next index is then the "End Balance" of the preceding index. For example, the "End Balance" at index 0 becomes the "Start Balance" at index 1.
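As an illustration of the rule above, here is a minimal sketch for a single row. The function name `end_balance` and its parameter names are hypothetical and only mirror the column names in the question.

```python
def end_balance(start_balance, stake, odds, ftr):
    """Apply the End Balance rule from the question to one row."""
    staked = stake * start_balance  # amount put at risk on this row
    if ftr == "H":
        # Winning row: lose the stake, then receive stake * odds back
        return start_balance - staked + staked * odds
    # Any other result: the stake is simply lost
    return start_balance - staked


# Index 0: 1000 - 250 + 625
print(end_balance(1000, 0.25, 2.5, "H"))  # 1375.0
# Index 1 starts from the End Balance of index 0 (about 1168.75)
print(end_balance(1375, 0.15, 4.0, "D"))
```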
To make things a bit more complicated, "Start Balance" must respect one more condition. If the "Ind" column is different from 1, for example 2, then the "Start Balance" for every row in that run (indices 1 and 2) is equal to the "End Balance" at index 0. Likewise, where "Ind" is 3, all of those indices (4, 5, 6) should have a "Start Balance" equal to the "End Balance" at index 3. The expected result is:
   Stake  Odds Ftr  Ind  Start Balance  End Balance
0   0.25  2.5    H    1         1000.0       1375.0
1   0.15  4      D    2         1375.0       1168.8
2   0.26  1.75   A    2         1375.0       1017.5
3   0.3   2.2    H    1         1017.5       1383.8
4   0.1   1.85   H    3         1383.8       1501.4
5   0.4   3.2    A    3         1383.8        830.3
6   0.32  1.5    D    3         1383.8        941.0
7   0.11  1.2    H    1          941.0        961.7
8   0.2   2.15   H    2          961.7       1182.9
9   0.25  1.65   A    2          961.7        721.3
I have not tried anything since I truly don't know how to approach so many conditions :). Cheers
Upvotes: 1
Views: 51
Reputation: 93161
I can't think of a vectorized way to do what you want, so a for
loop is the only solution I have:
# A temp dataframe to keep track of the End Balance by Ind
# It's empty to start
tmp = pd.DataFrame(columns=['index', 'End Balance']).rename_axis('ind')

for index, row in df.iterrows():
    stake, odds, ind = row['Stake'], row['Odds'], row['Ind']

    if index == 0:
        start_balance = 1000
    elif row['Ind'] == 1:
        start_balance = df.loc[index - 1, 'End Balance']
    else:
        start_balance = tmp.query('ind != @ind').sort_values('index')['End Balance'].iloc[-1]

    end_balance = start_balance * (1 - stake + stake * odds) if row['Ftr'] == 'H' else start_balance * (1 - stake)

    # Keep track of when the current Ind last occurs
    tmp.loc[ind, ['index', 'End Balance']] = [index, end_balance]

    df.loc[index, 'Start Balance'] = start_balance
    df.loc[index, 'End Balance'] = end_balance
Result:
   Stake  Odds Ftr  Ind  Start Balance  End Balance
0   0.25  2.50   H    1    1000.000000  1375.000000
1   0.15  4.00   D    2    1375.000000  1168.750000
2   0.26  1.75   A    2    1375.000000  1017.500000
3   0.30  2.20   H    1    1017.500000  1383.800000
4   0.10  1.85   H    3    1383.800000  1501.423000
5   0.40  3.20   A    3    1383.800000   830.280000
6   0.32  1.50   D    3    1383.800000   940.984000
7   0.11  1.20   H    1     940.984000   961.685648
8   0.20  2.15   H    2     961.685648  1182.873347
9   0.25  1.65   A    2     961.685648   721.264236
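For what it's worth, the loop might be avoidable under one reading of the rule. This is a sketch, not part of the answer above, and it assumes that rows form "blocks": every row with Ind equal to 1 is its own block, and each consecutive run of the same other Ind value is one block. Under that assumption every row in a block shares one Start Balance, and the next block starts from the End Balance of the block's last row, so block starts are a cumulative product. The names `factor`, `block`, `last_factor` and `block_start` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Stake": [0.25, 0.15, 0.26, 0.30, 0.10, 0.40, 0.32, 0.11, 0.20, 0.25],
    "Odds": [2.5, 4.0, 1.75, 2.2, 1.85, 3.2, 1.5, 1.2, 2.15, 1.65],
    "Ftr": ["H", "D", "A", "H", "H", "A", "D", "H", "H", "A"],
    "Ind": [1, 2, 2, 1, 3, 3, 3, 1, 2, 2],
})

# Per-row multiplier: End Balance = Start Balance * factor
factor = np.where(
    df["Ftr"].eq("H"),
    1 - df["Stake"] + df["Stake"] * df["Odds"],
    1 - df["Stake"],
)

# Assumed block rule: a new block starts on every Ind == 1 row
# and whenever Ind differs from the previous row
block = (df["Ind"].eq(1) | df["Ind"].ne(df["Ind"].shift())).cumsum()

# Only the last row of a block carries its balance into the next block
last_factor = pd.Series(factor, index=df.index).groupby(block).last()

# Start of block k = 1000 * product of the last factors of all earlier blocks
block_start = 1000 * last_factor.cumprod().shift(fill_value=1.0)

df["Start Balance"] = block.map(block_start)
df["End Balance"] = df["Start Balance"] * factor
print(df)
```

On the sample data this gives the same balances as the loop. It has not been checked against other inputs, and the two approaches can differ on edge cases, for example when the very first rows share an Ind other than 1.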
Upvotes: 1