Elham
Elham

Reputation: 867

checking condition in a loop (Pandas)

Consider the following dataframe:

df = pd.DataFrame(data=np.array([['a',1, 2, 3,'T'], ['a',4, 5, 6,'F'],
                                 ['b',7, 8, 9,'T'], ['b',10, 11 , 12,'T'], ['b',13, 14 , 15,'F']])
                  , columns=['id','A', 'B', 'C','T/F'])

  id   A   B   C T/F
0  a   1   2   3   T
1  a   4   5   6   F
2  b   7   8   9   T
3  b  10  11  12   T
4  b  13  14  15   F

I want to apply a condition to T/F column that will copy the rows of each id with T label to further columns of the same id. For example,I need the following result:

  id   A   B   C T/F
0  a   1   2   3   T       1   2   3   T
1  a   4   5   6   F       1   2   3   T
2  b   7   8   9   T       7   8   9   T
3  b  10  11  12   T       7   8   9   T
4  b  13  14  15   F       7   8   9   T
5  b   7   8   9   T      10   11 12   T
6  b  10  11  12   T      10   11 12   T
7  b  13  14  15   F      10   11 12   T

here is my script:

n = np.array(df.groupby('id').size())
m = len(df.groupby('id'))
Cnt = 0
df4 = pd.DataFrame()

for prsnNo in range(m):
    for i in range(n[prsnNo]):
        v = df.iloc[Cnt: Cnt + n[prsnNo], :].groupby('id').cumcount() == i
        df1 = df.iloc[Cnt: Cnt + n[prsnNo], :].where(v)
        temp = df4
        df4 = df.iloc[Cnt: Cnt + n[prsnNo], :].merge(df1, on="id", how="left")
        df4 = pd.concat([temp, df4])
    Cnt += n[prsnNo]

I do not know how to add a condition to check the value of T/F column in my loop.If I add the if condition in my loop it gives me an error.

for prsnNo in range(m):
        for i in range(n[prsnNo]):
          if df[df['T/F'] =='T'] :
            v = df.iloc[Cnt: Cnt + n[prsnNo], :].groupby('id').cumcount() == i
            df1 = df.iloc[Cnt: Cnt + n[prsnNo], :].where(v)
            temp = df4
            df4 = df.iloc[Cnt: Cnt + n[prsnNo], :].merge(df1, on="id", how="left")
            df4 = pd.concat([temp, df4])
        Cnt += n[prsnNo]

Thanks,

Upvotes: 1

Views: 542

Answers (1)

cs95
cs95

Reputation: 402363

If order doesn't matter, you can use groupby + first, and then perform a merge with df and the grouped result.

v = df.groupby(['id', df['T/F'].eq('T').cumsum()])\
                 .first().reset_index(level=1, drop=True)

df = df.merge(v, left_on='id', right_index=True)  
df.columns = df.columns.str.split('_').str[0]

df

  id   A   B   C T/F   A   B   C T/F
0  a   1   2   3   T   1   2   3   T
1  a   4   5   6   F   1   2   3   T
2  b   7   8   9   T   7   8   9   T
2  b   7   8   9   T  10  11  12   T
3  b  10  11  12   T   7   8   9   T
3  b  10  11  12   T  10  11  12   T
4  b  13  14  15   F   7   8   9   T
4  b  13  14  15   F  10  11  12   T

Upvotes: 1

Related Questions