Sid

Reputation: 4055

Loop through dataframe, once condition is met, start loop again from where the condition was met?

I am trying to loop through a dataframe and build a dictionary that maps the index of each row where Entry is 'Y' to the index of the next row where Exit is 'E'. After each match, the iteration should start again from the index where the exit was found.

I have this so far:

d = {}
index_number = 12
for i, r in df.iloc[index_number:].iterrows():
    print(index_number)
    if r['Entry'] == 'Y':
        print(i)
        ix_num = i + 1

        for e, ro in df.iloc[ix_num:].iterrows():
            if ro['Exit'] == 'E':
                d[i] = e
                index_number = e

                print(e)

                input('Check')
                break

df:    
    Entry   Exit
12       Y  NaN
13       Y  NaN
14       Y    E
15       Y    E
16       Y  NaN
17       Y  NaN
18       Y  NaN
19     NaN    E
20       Y  NaN
21     NaN    E
22     NaN    E
23     NaN    E
24       Y  NaN
25       Y  NaN
26     NaN    E
27       Y  NaN
28     NaN    E
29     NaN    E

The problem is that the outer loop does not pick up the updated index_number once both conditions are met; it carries on from the row it had already reached.

Expected output:

d = {12:14,15:19,20:21,24:26,27:28}

Thanks for your help

Edit:

I am using the following for now:

v = []
x = []
for i, r in df.iterrows():
    if r['Entry'] == 'Y':
        x.append(i)
    if r['Exit'] == 'E':
        v.append(i)


d = {}
exce = []
check_val = 0
for i in x:
    if i > check_val:
        for e in v:
            if e>i and e not in exce:
                d[i] = e
                exce.append(e)
                check_val = e
                break

Upvotes: 0

Views: 321

Answers (1)

giser_yugang

Reputation: 6166

A vectorized approach:

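import pandas as pd
# Stack the 'E' exits and the 'Y' entries into one Series, keeping the original row labels
df = pd.concat([df[df.Exit=='E']['Exit'],df[df.Entry=='Y']['Entry']])
# Move the row labels into an 'index' column and order the events by row label
df = df.reset_index().rename(columns = {0:'label'}).sort_values('index')
# Collapse each run of identical labels down to its first event
df = df[df['label']!=df['label'].shift(1)]
# Attach the row label of the following event to each event
df['E_index'] = df['index'].shift(-1)
# Keep only a 'Y' that is immediately followed by an 'E'
df = df[(df['label']+df['label'].shift(-1))=='YE']
d = dict(zip(df['index'].astype(int), df['E_index'].astype(int)))
print(d)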

{12: 14, 15: 19, 20: 21, 24: 26, 27: 28}
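
For comparison, here is a minimal sketch of a plain single-pass loop. It is not the asker's or the answerer's code. The outer loop in the question never restarts because `df.iloc[index_number:].iterrows()` is evaluated once, when the `for` statement begins, so assigning a new value to `index_number` later has no effect on the iterator that is already running. In addition, `iloc` slices by position, not by label, so passing a label such as 13 to it does not start at the row labelled 13. The sketch avoids both issues by remembering the currently open entry instead of restarting. The function name `pair_entries_with_exits` and its parameters are hypothetical, and the sample lists are copied from the table in the question. It assumes that an exit must come strictly after its entry, and that a row which closes a pair does not open a new one, which is what the expected output implies for rows 14 and 15.

```python
def pair_entries_with_exits(labels, entries, exits):
    """Map each entry label to the label of the first later exit.

    Hypothetical helper: scans the rows once, holding at most one open entry.
    """
    pairs = {}
    open_entry = None  # label of the entry still waiting for its exit
    for label, entry_val, exit_val in zip(labels, entries, exits):
        if open_entry is not None and exit_val == "E":
            # Close the open entry; this row does not also open a new one.
            pairs[open_entry] = label
            open_entry = None
        elif open_entry is None and entry_val == "Y":
            # No entry is open, so this 'Y' starts a new pair.
            open_entry = label
    return pairs


# Sample data copied from the question's table (row labels 12 to 29).
nan = float("nan")
labels = list(range(12, 30))
entries = ["Y", "Y", "Y", "Y", "Y", "Y", "Y", nan, "Y",
           nan, nan, nan, "Y", "Y", nan, "Y", nan, nan]
exits = [nan, nan, "E", "E", nan, nan, nan, "E", nan,
         "E", "E", "E", nan, nan, "E", nan, "E", "E"]

d = pair_entries_with_exits(labels, entries, exits)
print(d)  # {12: 14, 15: 19, 20: 21, 24: 26, 27: 28}
```

With the dataframe from the question, the same function could be called as `pair_entries_with_exits(df.index, df['Entry'], df['Exit'])`. Because it walks the rows in their existing order, the result does not depend on how rows that contain both 'Y' and 'E' are ordered by a sort.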

Upvotes: 1
