Pandas Rolling Change Thresholds

Question

Let's say I have two columns, 'a' and 'b', of time series data in a Pandas dataframe. I want to create a third column that indicates if the difference between column 'a' at the current time period and column 'b' at any of the next 5 time periods increases by 8 or more before it decreases by 2 or more. Ideally I'd do this using some form of df.rolling(5).apply() and without any loops, but I keep running into challenges.

For demonstration's sake, I've written the logic out with loops but if anyone can give me some direction on how to do this more efficiently or elegantly, I'd really appreciate it. In reality the dataframe and the window will be much larger.

df = pd.DataFrame({'a':[1,2,3,4,5,6,7,8,9,10], 'b':[1,0,9,0,15,0,20,15,23,6]})
df['c'] = 0

window = 5
positive_thresh = 8
negative_thresh = -2
num_rows = df.shape[0]

for a_idx in range(num_rows):
    a_start = df.iloc[a_idx,0]
    b_roll = df.iloc[(a_idx + 1):max(a_idx + 1 + window,num_rows), 1]
    deltas = b_roll - a_start
    positives = deltas[deltas>=positive_thresh]
    negatives = deltas[deltas<=negative_thresh]
    first_pos_idx = positives.index[0] if len(positives) > 0 else num_rows
    first_neg_idx = negatives.index[0] if len(negatives) > 0 else num_rows
    
    if first_pos_idx < first_neg_idx:
        df.iloc[a_idx,2] = 1

print(df)

    a   b  c
0   1   1  1
1   2   0  0
2   3   9  0
3   4   0  1
4   5  15  0
5   6   0  1
6   7  20  1
7   8  15  1
8   9  23  0
9  10   6  0

Pandas Rolling Change Thresholds

Answers (1)

Related Questions