Reputation: 1771
Is there a more efficient way to change long/short positions in pandas than what I have come up with below?
The gist of the logic:
np.nan
until there is a buy (+1) or sell (-1) signal.import numpy as np
import pandas as pd
df = pd.DataFrame({
'buy_sell': [np.nan, 1, np.nan, 1, np.nan, np.nan, 1, -1, np.nan, -1, np.nan, 1, np.nan, -1],
'position': np.nan
})
for i, r, in df.iterrows():
buy_sell = r['buy_sell']
# Check if first index
if i != 0:
last_position = df.loc[i-1, 'position']
if np.isnan(buy_sell):
df.loc[i, 'position'] = last_position
else:
if np.isnan(last_position) or last_position * buy_sell <= 0:
df.loc[i, 'position'] = buy_sell
else:
df.loc[i, 'position'] = last_position + buy_sell
else:
df.loc[i, 'position'] = buy_sell
df_expected = pd.DataFrame({
'buy_sell': [np.nan, 1, np.nan, 1, np.nan, np.nan, 1, -1, np.nan, -1, np.nan, 1, np.nan, -1],
'position': [np.nan, 1, 1, 2, 2, 2, 3, -1, -1, -2, -2, 1, 1, -1],
})
buy_sell position
0 NaN NaN
1 1.0 1.0
2 NaN 1.0
3 1.0 2.0
4 NaN 2.0
5 NaN 2.0
6 1.0 3.0
7 -1.0 -1.0
8 NaN -1.0
9 -1.0 -2.0
10 NaN -2.0
11 1.0 1.0
12 NaN 1.0
13 -1.0 -1.0
Note: The code I have above produces the expected data frame. I am asking if there is a more efficient/better way of doing what I have done above.
Upvotes: 4
Views: 894
Reputation: 4771
You can do something like this:
df["position"] = df.buy_sell.groupby(\
df.buy_sell.bfill()\
.diff()\
.abs()\
.cumsum()\
).cumsum().ffill()
Output
df
buy_sell position
0 NaN NaN
1 1.0000 1.0000
2 NaN 1.0000
3 1.0000 2.0000
4 NaN 2.0000
5 NaN 2.0000
6 1.0000 3.0000
7 -1.0000 -1.0000
8 NaN -1.0000
9 -1.0000 -2.0000
10 NaN -2.0000
11 1.0000 1.0000
12 NaN 1.0000
13 -1.0000 -1.0000
Explanation
Basically, you want some sort of cumsum
by groups created by the sign changes in the column, then, first of all we need to find those changes:
>>> df.buy_sell.bfill().diff().abs()
0 NaN
1 0.0000
2 0.0000
3 0.0000
4 0.0000
5 0.0000
6 0.0000
7 2.0000
8 0.0000
9 0.0000
10 2.0000
11 0.0000
12 2.0000
13 0.0000
As you can see, every 2
is a sign change. Now, usign pandas.Series.cumsum
we can assign a group to every row
>>> df.buy_sell.bfill().diff().abs().cumsum()
0 NaN
1 0.0000
2 0.0000
3 0.0000
4 0.0000
5 0.0000
6 0.0000
7 2.0000
8 2.0000
9 2.0000
10 4.0000
11 4.0000
12 6.0000
13 6.0000
So, every row with 0 is a group, every row with 2 is a group, and so on. With this, we can directly use pandas.Series.groupby
with pandas.core.groupby.GroupBy.cumsum
:
>>> df.buy_sell.groupby(df.buy_sell.bfill().diff().abs().cumsum()).cumsum()
0 NaN
1 1.0000
2 NaN
3 2.0000
4 NaN
5 NaN
6 3.0000
7 -1.0000
8 NaN
9 -2.0000
10 NaN
11 1.0000
12 NaN
13 -1.0000
All we need now is to propagate last valid observation forward to next valid with pandas.Series.ffill
:
>>> df.buy_sell.groupby(df.buy_sell.bfill().diff().abs().cumsum()).cumsum().ffill()
0 NaN
1 1.0000
2 1.0000
3 2.0000
4 2.0000
5 2.0000
6 3.0000
7 -1.0000
8 -1.0000
9 -2.0000
10 -2.0000
11 1.0000
12 1.0000
13 -1.0000
Upvotes: 5