MD Abid Hasan

Reputation: 361

Deleting/Selecting rows from pandas based on conditions on multiple columns

From a pandas dataframe, I need to delete specific rows based on a condition applied on two columns of the dataframe.

The dataframe is

          0         1         2         3
0 -0.225730 -1.376075  0.187749  0.763307
1  0.031392  0.752496 -1.504769 -1.247581
2 -0.442992 -0.323782 -0.710859 -0.502574
3 -0.948055 -0.224910 -1.337001  3.328741
4  1.879985 -0.968238  1.229118 -1.044477
5  0.440025 -0.809856 -0.336522  0.787792
6  1.499040  0.195022  0.387194  0.952725
7 -0.923592 -1.394025 -0.623201 -0.738013

I need to delete the rows where the absolute difference between column 1 and column 2 is less than a threshold t, i.e. the rows for which this holds:

abs(column1.iloc[index]-column2.iloc[index]) < t

I have seen examples where conditions are applied individually on column values but did not find anything where a row is deleted based on a condition applied on multiple columns.

Upvotes: 1

Views: 126

Answers (1)

jezrael

Reputation: 862601

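First select the columns by position with DataFrame.iloc, subtract them, and take the absolute value with Series.abs. Then compare against the threshold with the inverted operator: rows where the difference is < t should be deleted, so keep the rows where it is >= t (or > t if the boundary should also go), and filter by boolean indexing: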

df = df[(df.iloc[:, 0]-df.iloc[:, 1]).abs() >= t]

If you need to select the columns by label instead, here 0 and 1:

df = df[(df[0]-df[1]).abs() >= t]
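For a quick check, here is a minimal sketch that rebuilds the dataframe from the question and applies the filter. The threshold t = 1.0 is an assumption picked only for illustration (the question does not give a value), and the names diff, kept and dropped are just local helpers. It also shows the same result written as an explicit DataFrame.drop on the rows that match the delete condition:

```python
import pandas as pd

# Data copied from the question; columns get the default labels 0..3.
df = pd.DataFrame(
    [
        [-0.225730, -1.376075, 0.187749, 0.763307],
        [0.031392, 0.752496, -1.504769, -1.247581],
        [-0.442992, -0.323782, -0.710859, -0.502574],
        [-0.948055, -0.224910, -1.337001, 3.328741],
        [1.879985, -0.968238, 1.229118, -1.044477],
        [0.440025, -0.809856, -0.336522, 0.787792],
        [1.499040, 0.195022, 0.387194, 0.952725],
        [-0.923592, -1.394025, -0.623201, -0.738013],
    ]
)

t = 1.0  # assumed threshold, not given in the question

# Absolute difference between the first two columns, selected by position.
diff = (df.iloc[:, 0] - df.iloc[:, 1]).abs()

# Keep the rows that do NOT satisfy the delete condition (diff < t).
kept = df[diff >= t]

# Equivalent formulation: drop the index labels where the condition holds.
dropped = df.drop(df.index[diff < t])

print(kept)
```

With this assumed threshold, rows 0, 4, 5 and 6 remain. The boolean-mask form is usually the shorter one; the drop form may read more naturally when you think of the condition as "rows to delete".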

Upvotes: 2
