manwong0606

Reputation: 147

Pandas DataFrame create new columns based on a logic dependent on other columns with cumulative counting rule

I have a DataFrame originally as follows:

d1={'on':[0,1,0,1,0,0,0,1,0,0,0],'off':[0,0,0,0,0,0,1,0,1,0,1]}

[Image: original DataFrame]

My end objective is to add a new column 'final' that shows a value of 1 once the 'on' indicator is triggered (ignoring any duplicate 'on' signals), and switches back to 0 when the 'off' indicator is triggered, but only if 'final' has already been on for 3 rows. I tried to come up with some code but could not get anywhere.

My desired output is as follows:

[Image: desired output]

Column 'final' is first set to 1 in row 1, when the 'on' indicator switches to 1. The 'on' indicator in row 3 is ignored because it is a redundant signal. The 'off' indicator in row 6 switches 'final' back to 0 because it has already been on for more than 3 rows. By contrast, the 'off' indicator in row 8 comes too early: 'final' cannot be switched off until the next 'off' indicator in row 10, by which time it has been on for more than 3 rows.

Thank you for your help. Much appreciated.

Upvotes: 1

Views: 68

Answers (2)

Алексей Р

Reputation: 7627

import pandas as pd

d1 = {'on': [0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0], 'off': [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1]}
df = pd.DataFrame(d1)
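df['final'] = 0
status, hook = 0, 0  # status: current 'final' value; hook: index of the latest 'on' row

for index, row in df.iterrows():
    # Remember the most recent row where 'on' fired (repeated 'on' signals move it too)
    hook = index if row['on'] else hook
    # Stay on unless 'off' fires at least 3 rows after the latest 'on'
    status = int((row['on'] or status) and (not (row['off'] and index - hook > 2)))
    # iterrows() yields copies, so write back through the DataFrame itself
    df.at[index, 'final'] = status
print(df)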

Output:

    on  off  final
0    0    0      0
1    1    0      1
2    0    0      1
3    1    0      1
4    0    0      1
5    0    0      1
6    0    1      0
7    1    0      1
8    0    1      1
9    0    0      1
10   0    1      0
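
For reference, the rule used in this loop can be sketched as a plain function over two sequences, which makes it easier to test without a DataFrame. The names `latch_since_last_on` and `min_gap` are hypothetical and not part of the answer. The sketch assumes the same behaviour as the loop above: the gap is measured from the most recent 'on' row, so a repeated 'on' restarts the count.

```python
def latch_since_last_on(on, off, min_gap=3):
    """Return a list of 0/1 values following the loop above.

    The latch is set by an 'on' signal and released by an 'off' signal
    only if at least `min_gap` rows have passed since the latest 'on'.
    """
    final = []
    status = 0    # current latched value
    last_on = 0   # position of the most recent 'on' signal
    for i, (on_i, off_i) in enumerate(zip(on, off)):
        if on_i:
            last_on = i
        release = bool(off_i) and (i - last_on) >= min_gap
        status = int(bool(on_i or status) and not release)
        final.append(status)
    return final


on = [0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
off = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1]
result = latch_since_last_on(on, off)
print(result)

# A repeated 'on' at position 3 restarts the count, so the 'off' at 4 is ignored
restarted = latch_since_last_on([1, 0, 0, 1, 0], [0, 0, 0, 0, 1])
print(restarted)
```

With a DataFrame, the result could be assigned as `df['final'] = latch_since_last_on(df['on'], df['off'])`.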

Upvotes: 0

Andrej Kesely

Reputation: 195468

One solution using a "state machine" implemented with yield:

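import pandas as pd

d1 = {'on': [0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0], 'off': [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1]}


def state_machine():
    # Receive the first (on, off) pair
    on, off = yield
    # cnt: rows counted while 'final' is on; current: the current 'final' value
    cnt, current = 0, on
    while True:
        # Stay on once triggered; a repeated 'on' changes nothing
        current = int(on or current)
        cnt += current

        # Switch off only if 'final' has been on for more than 3 rows
        if off and cnt > 3:
            cnt = 0
            current = 0

        # Emit the result for this row and wait for the next pair
        on, off = yield current


machine = state_machine()
next(machine)  # advance the generator to its first yield

df = pd.DataFrame(d1)
df['final'] = df.apply(lambda x: machine.send((x['on'], x['off'])), axis=1)

print(df)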

Prints:

    on  off  final
0    0    0      0
1    1    0      1
2    0    0      1
3    1    0      1
4    0    0      1
5    0    0      1
6    0    1      0
7    1    0      1
8    0    1      1
9    0    0      1
10   0    1      0
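
As an alternative sketch, the same counting rule can be written as an ordinary function that walks the two columns in order, so the state lives in local variables rather than in a generator. The names `latch_with_min_hold` and `min_rows` are hypothetical. The sketch assumes the rule from the generator above: rows are counted from the moment 'final' turns on, and 'off' only takes effect once that count exceeds `min_rows`.

```python
def latch_with_min_hold(on, off, min_rows=3):
    """Return a list of 0/1 values: latch on at an 'on' signal and release
    at an 'off' signal only after being held for more than `min_rows` rows."""
    final = []
    state = 0  # current latched value
    held = 0   # rows counted since the latch was set (current row included)
    for on_i, off_i in zip(on, off):
        if on_i:
            state = 1  # a repeated 'on' while latched changes nothing
        held += state
        if off_i and held > min_rows:
            state, held = 0, 0
        final.append(state)
    return final


on = [0, 1, 0, 1, 0, 0, 0, 1, 0, 0, 0]
off = [0, 0, 0, 0, 0, 0, 1, 0, 1, 0, 1]
result = latch_with_min_hold(on, off)
print(result)
```

With a DataFrame, the result could be assigned as `df['final'] = latch_with_min_hold(df['on'], df['off'])`.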

Upvotes: 2
