Shew

Reputation: 1596

Creating a new column from pairwise row entries in pandas

I have a dataframe as given below

>>> df
    t    c  f  e
0   1  100  2  1
1   1  200  1  1
2   1  300  4  0
3   1  400  2  0
4   2  100  3  1
5   2  200  3  1
6   2  300  4  1
7   2  400  1  0
8   3  100  4  0
9   3  200  3  0
10  3  300  1  1
11  3  400  4  1
12  4  100  1  1
13  4  200  4  1
14  4  300  4  1
15  4  400  2  1

I want to add new columns based on pairwise comparisons between rows. In the example above, I want to add a column 'rr' that is 1 if the i-th row and the (i+4)-th row have the same value in column 'e', and 0 otherwise (including when index i+4 does not exist). Similarly, I want to add another column 'rr2' that is 1 if the i-th row and the (i+1)-th row have the same value in column 'e'.

>>> df
    t    c  f  e  rr  rr2
0   1  100  2  1   1    1
1   1  200  1  1   0    1
2   1  300  4  0   1    0
3   1  400  2  0   0    1
4   2  100  3  1   1    0
5   2  200  3  1   1    0
6   2  300  4  1   0    1
7   2  400  1  0   1    0
8   3  100  4  0   1    0
9   3  200  3  0   0    1
10  3  300  1  1   1    1
11  3  400  4  1   1    1
12  4  100  1  1   1    0
13  4  200  4  1   1    0
14  4  300  4  1   1    0
15  4  400  2  1   1    0

My idea was to use the apply method, with func1 and func2 standing in for row-wise functions:

df['rr'] = df.apply(func1, axis=1)
df['rr2'] = df.apply(func2, axis=1)

But in that case each call only sees a single row, so I cannot access rows i+1 or i+4 of the original dataframe. Is there a way to do this efficiently, rather than going through the rows one by one?

Upvotes: 0

Views: 63

Answers (1)

ansev

Reputation: 30920

Use Series.shift to line up each row with the row 4 (or 1) positions below it, then compare the two with Series.eq:

df['rr'] = df['e'].eq(df['e'].shift(-4)).astype(int)
df['rr2'] = df['e'].eq(df['e'].shift(-1)).astype(int)
print(df)

    t    c  f  e  rr  rr2
0   1  100  2  1   1    1
1   1  200  1  1   1    0
2   1  300  4  0   0    1
3   1  400  2  0   1    0
4   2  100  3  1   0    1
5   2  200  3  1   0    1
6   2  300  4  1   1    0
7   2  400  1  0   0    1
8   3  100  4  0   0    1
9   3  200  3  0   0    0
10  3  300  1  1   1    1
11  3  400  4  1   1    1
12  4  100  1  1   0    1
13  4  200  4  1   0    1
14  4  300  4  1   0    1
15  4  400  2  1   0    0
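For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds only the 'e' column from the question, and the helper name same_as_row_ahead is a hypothetical wrapper introduced for illustration, not part of pandas.

```python
import pandas as pd

# Rebuild the 'e' column from the question; the other columns are not needed here.
df = pd.DataFrame({"e": [1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 1]})


def same_as_row_ahead(col: pd.Series, k: int) -> pd.Series:
    """Hypothetical helper: 1 where row i equals row i+k, else 0.

    Rows whose i+k partner does not exist are compared against NaN
    and therefore come out as 0.
    """
    return col.eq(col.shift(-k)).astype(int)


df["rr"] = same_as_row_ahead(df["e"], 4)
df["rr2"] = same_as_row_ahead(df["e"], 1)
print(df)
```

Both columns are computed with vectorized operations on the whole column, so no row-by-row apply is needed.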

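Note:

shift leaves NaN in the positions that have no row to pull from (the last 4 rows for shift(-4), the last row for shift(-1)). Comparing any value with NaN using eq returns False, so those rows get 0.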
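The NaN behaviour can be checked on a small example. This is only an illustrative sketch; the three-element series and the variable names are made up for the demonstration.

```python
import pandas as pd

e = pd.Series([1, 1, 0])

# The last position has no following row, so shift(-1) fills it with NaN
# (and the series is upcast to float).
shifted = e.shift(-1)

# Element-wise comparison: any comparison involving NaN is False.
matches = e.eq(shifted)

# Even NaN compared with NaN is False under eq.
nan_vs_nan = pd.Series([float("nan")]).eq(pd.Series([float("nan")]))

print(shifted.tolist())
print(matches.tolist())
print(nan_vs_nan.tolist())
```

This is why the trailing rows end up as 0 after astype(int) without any special handling.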

Upvotes: 1
