Reputation: 390
I have a dataframe like this:
   Price  Signal
0  28.68      -1
1  33.36       1
2  44.7       -1
3  43.38       1  ---- smaller than Price[2] # False: Drop row[3,4]
4  41.67      -1
5  42.17       1  ---- smaller than Price[2] # False: Drop row[5,6]
6  44.21      -1
7  46.34       1  ---- greater than Price[2] # True: Keep
8  45.2       -1
9  43.4        1  ---- Still Keep because it is the last row
My logic is: keep a row with Signal 1 if its price is greater than the price in the row before it. If not, drop that row and the next row, since the signals must alternate between -1 and 1, and then compare the next Signal 1 row with the last kept row above (as annotated in the snapshot of my dataframe above).
The last Signal 1 row is still kept even though it does not satisfy the condition, because the rule is that the last item of the Signal column must be 1.
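To make the rule concrete, here is a minimal sketch on plain Python lists rather than a dataframe. The function name `keep_positions` is hypothetical, and the sketch assumes the signals strictly alternate, starting with -1 and ending with 1, and that "greater" means strictly greater, so an equal price is dropped.

```python
prices = [28.68, 33.36, 44.7, 43.38, 41.67, 42.17, 44.21, 46.34, 45.2, 43.4]
signals = [-1, 1, -1, 1, -1, 1, -1, 1, -1, 1]


def keep_positions(prices, signals):
    """Return the positions of the rows that survive the rule.

    Hypothetical helper: assumes signals alternate -1, 1, ... and end with 1.
    """
    kept = []
    last = len(prices) - 1
    i = 0
    while i <= last:
        # A Signal 1 row is compared with the last kept row before it.
        # The first and the last row are never checked.
        if signals[i] == 1 and 0 < i < last and prices[i] <= prices[kept[-1]]:
            # Not greater: skip this row and the -1 row that follows it.
            i += 2
            continue
        kept.append(i)
        i += 1
    return kept


print(keep_positions(prices, signals))
```

Each position is visited at most once, so no recursion is needed and nothing is deleted while iterating.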
Here is my attempt so far:
def filter_sell(df):
    # For export the result
    filtered_sell_df = pd.DataFrame()
    for i in range(0, len(df) + 1):
        if df.iloc[i]["Signal"] == 1:
            if df.iloc[i]["Price"] > df.iloc[i - 1]["Price"]:
                pass
            else:
                try:
                    df.drop([i, i + 1])
                    filter_sell(df)
                # Try to handle the i + 1 above since len(df) is changed
                except RecursionError:
                    break
        else:
            pass
I'm new to writing recursion, thanks for your help!
Upvotes: 1
Views: 243
Reputation: 11222
You can do it without recursion. By the way, your approach will be slow because you call .drop() inside a loop. The easiest way is to use a new column to mark rows for deletion.
import pandas as pd

df = pd.DataFrame({
    'Price': (28.68, 33.36, 44.7, 43.38, 41.67, 42.17, 44.21, 46.34, 45.2, 43.4),
    'Signal': (-1, 1, -1, 1, -1, 1, -1, 1, -1, 1),
})
# column with flag for deleting unnecessary records (1 = keep, 0 = remove)
df['max_price'] = 1
# default max_price in first row
max_price = df['Price'].loc[0]
index = 1
# because we do not check last record
stop_index = len(df.index) - 1
while index < stop_index:
    # just check max price because signal != 1
    if df['Signal'].loc[index] == -1:
        current = df['Price'].loc[index]
        if current > max_price:
            max_price = current
        index += 1
        continue
    current = df['Price'].loc[index]
    if max_price > current:
        # last max_price > current
        # set 'remove flag' to current and next row
        # (df.loc[row, col] avoids chained assignment, which may not update df)
        df.loc[index, 'max_price'] = 0
        df.loc[index + 1, 'max_price'] = 0
        # increase index by 2 because next row will be removed
        index += 2
        continue
    index += 1
# just drop records without max_price and drop column
df = df[df['max_price'] == 1]
df = df.drop(columns=['max_price'])
print(df)
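The same flagging idea can also be wrapped in a function that leaves the input dataframe untouched. The following is only a sketch under a few assumptions: it reuses the name `filter_sell` from the question, replaces the helper column with a plain list of booleans, and compares each Signal 1 price against the running maximum of the kept -1 prices, exactly as the loop above does. It also assumes the signals alternate and the last row has Signal 1.

```python
import pandas as pd


def filter_sell(df):
    """Return a filtered copy of df without modifying the original."""
    if df.empty:
        return df.copy()
    prices = df["Price"].tolist()
    signals = df["Signal"].tolist()
    keep = [True] * len(df)   # one keep/remove flag per row, by position
    max_price = prices[0]
    i = 1
    while i < len(df) - 1:    # the last row is never checked
        if signals[i] == -1:
            # A kept -1 row only updates the running maximum.
            max_price = max(max_price, prices[i])
            i += 1
        elif prices[i] < max_price:
            # Flag this row and the -1 row after it, then skip both.
            keep[i] = keep[i + 1] = False
            i += 2
        else:
            i += 1
    # Boolean mask selection returns a new dataframe.
    return df.loc[keep]


df = pd.DataFrame({
    "Price": (28.68, 33.36, 44.7, 43.38, 41.67, 42.17, 44.21, 46.34, 45.2, 43.4),
    "Signal": (-1, 1, -1, 1, -1, 1, -1, 1, -1, 1),
})
result = filter_sell(df)
print(result)
```

Because the flags are positional, this version does not depend on the dataframe having a default 0..n-1 index, and working on Python lists avoids a pandas lookup on every iteration.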
Hope this helps.
Upvotes: 1