For this DataFrame :
import numpy as np
import pandas as pd
columns = ['A', 'B', 'C']
data = np.array([[1, 2, 2], [4, 5, 4], [7, 8, 18]])
df2 = pd.DataFrame(data, columns=columns)
df2['C']
If the difference between consecutive rows in column C is <= 2, then both the previous and the current row should be returned. So I'm attempting to filter out rows where the difference from the previous row is > 2.
So I'm expecting these rows to be returned :
[1,2,2]
[4,5,4]
[7,8,18]
I'm attempting to implement this functionality using the shift function :
df2[(df2.A - df2.shift(1).A >= 2)]
The result of which is :
   A  B   C
1  4  5   4
2  7  8  18
I think I need to apply a function to each row in order to achieve this ?
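A row-wise function may not be necessary. As a minimal sketch, assuming the intended condition is the one used in the code above (column A, difference >= 2) rather than the one described in the prose, the boolean mask can be combined with a copy of itself shifted up by one row, so that both rows of each qualifying pair are kept. The names jump, keep and result are only illustrative :

```python
import numpy as np
import pandas as pd

columns = ['A', 'B', 'C']
data = np.array([[1, 2, 2], [4, 5, 4], [7, 8, 18]])
df2 = pd.DataFrame(data, columns=columns)

# True on each row whose A value is at least 2 above the previous row's.
# The first row has no predecessor (diff is NaN), so it compares as False.
jump = df2['A'].diff() >= 2

# Shifting the mask up by one marks the row just before each jump.
# fill_value=False keeps the mask boolean at the end of the frame.
keep = jump | jump.shift(-1, fill_value=False)

result = df2[keep]
print(result)
```

With this data every row is either the start or the end of a qualifying pair, so all three rows are selected.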
Update :
Alternative use case :
columns = ['A','B', 'C']
data = np.array([[1,2,2] , [2,5,3], [7,8,16]])
df2 = pd.DataFrame(data,columns=columns)
df2[df2.A.diff().shift(-1) >= 2]
Returned is :
   A  B  C
1  2  5  3
but I'm expecting :
   A  B   C
1  2  5   3
2  7  8  16
So in this case I'm expecting the current and the next row to be returned, as the difference between 2 & 7 (column A) in
2 5 3
& 7 8 16
is > 2
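To see why only one row comes back, it may help to lay the intermediate Series side by side. The sketch below is only a diagnostic; the steps frame and its column names are made up for illustration. It suggests that diff() flags the second row of a pair, diff().shift(-1) flags the first row, and neither flags both on its own :

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame(np.array([[1, 2, 2], [2, 5, 3], [7, 8, 16]]),
                   columns=['A', 'B', 'C'])

# Intermediate values for column A, one column per step.
steps = pd.DataFrame({
    'A': df2['A'],
    'diff': df2['A'].diff(),                 # difference to the previous row
    'next_diff': df2['A'].diff().shift(-1),  # difference to the next row
})

# Comparisons against NaN evaluate to False.
steps['diff_hit'] = steps['diff'] >= 2       # marks the later row of a pair
steps['next_hit'] = steps['next_diff'] >= 2  # marks the earlier row of a pair
steps['either'] = steps['diff_hit'] | steps['next_hit']

print(steps)
```

The either column is True for rows 1 and 2, which are the two rows expected here.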
Update 2 :
Edge case : if the last difference being compared is < 2, then that row is ignored
columns = ['A','B', 'C']
data = np.array([[2,2,2] , [3,5,3], [5,8,16], [6,8,16]])
df2 = pd.DataFrame(data,columns=columns)
df2[df2.A.diff().shift(-1).ffill() >= 2]
returns :
   A  B  C
1  3  5  3
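Here ffill() only replaces the trailing NaN left by shift(-1) with the previous difference (1), so the mask is unchanged. A possible alternative, sketched under the assumption that both rows on either side of a jump of at least 2 in column A are wanted, is a small helper. rows_around_jumps is a hypothetical name, not a pandas function :

```python
import pandas as pd


def rows_around_jumps(df, col, threshold):
    """Hypothetical helper: keep both rows of every consecutive pair
    whose difference in `col` is at least `threshold`."""
    # Later row of each qualifying pair (NaN for the first row gives False).
    jump = df[col].diff() >= threshold
    # Earlier row of each pair; the last row has no successor, hence False.
    before_jump = jump.shift(-1, fill_value=False)
    return df[jump | before_jump]


df2 = pd.DataFrame([[2, 2, 2], [3, 5, 3], [5, 8, 16], [6, 8, 16]],
                   columns=['A', 'B', 'C'])

result = rows_around_jumps(df2, 'A', 2)
print(result)
```

On this data the helper selects rows 1 and 2 (A = 3 and A = 5). The last row is left out because its difference to the previous row is only 1.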