Reputation: 361
From a pandas dataframe, I need to delete specific rows based on a condition applied on two columns of the dataframe.
The dataframe is
0 1 2 3
0 -0.225730 -1.376075 0.187749 0.763307
1 0.031392 0.752496 -1.504769 -1.247581
2 -0.442992 -0.323782 -0.710859 -0.502574
3 -0.948055 -0.224910 -1.337001 3.328741
4 1.879985 -0.968238 1.229118 -1.044477
5 0.440025 -0.809856 -0.336522 0.787792
6 1.499040 0.195022 0.387194 0.952725
7 -0.923592 -1.394025 -0.623201 -0.738013
I need to delete some rows where the difference between column 1
and columns 2
is less than threshold t
.
abs(column1.iloc[index]-column2.iloc[index]) < t
I have seen examples where conditions are applied individually on column values but did not find anything where a row is deleted based on a condition applied on multiple columns.
Upvotes: 1
Views: 126
Reputation: 862601
First select columns by DataFrame.iloc
for positions, subtract, get Series.abs
, compare by thresh with inverse
opearator like <
to >=
or >
and filter by boolean indexing
:
df = df[(df.iloc[:, 0]-df.iloc[:, 1]).abs() >= t]
If need select columns by names, here 0
and 1
:
df = df[(df[0]-df[1]).abs() >= t]
Upvotes: 2