Laxmikant

Reputation: 2216

Multiple conditions on pandas dataframe

I have a list of conditions that I need to apply to a large dataset in order to filter it.

df is a huge DataFrame, e.g.:

  Index D1  D2  D3   D5      D6
    0   8   5   0  False   True
    1  45  35   0   True  False
    2  35  10   1  False   True
    3  40   5   2   True  False
    4  12  10   5  False  False
    5  18  15  13  False   True
    6  25  15   5   True  False
    7  35  10  11  False   True
    8  95  50   0  False  False

I have to filter the above df based on the given orders:

orders = [[A, B],[D, ~E, B], [~C, ~A], [~C, A]...] 
#(where A, B, C , D, E are the conditions) 

e.g.

A = df['D1'].le(50)
B = df['D2'].ge(5)
C = df['D3'].ne(0)
D = df['D1'].ne(False)
E = df['D1'].ne(True)
# In the real scenario, I have 64 such conditions to be run on 5 million records. 

I have to apply all of these conditions to get the resulting output.

What is the easiest way to achieve the following: applying each order in turn using a for loop, map, or .apply?

  df = df.loc[A & B]
  df = df.loc[D & ~E & B]
  df = df.loc[~C & ~A]
  df = df.loc[~C & A]

The resulting df is my expected output.

I am more interested in how you would use a loop, map, or .apply to run multiple conditions stored in a list than in the resulting output itself.

For example:

for i in orders:
   df = df[all(i)] # I am not able to implement this logic for each order

Upvotes: 0

Views: 173

Answers (1)

Quang Hoang

Reputation: 150755

You are looking for the bitwise AND of all the elements inside orders. In that case:

import numpy as np
df = df[np.concatenate(orders).all(axis=0)]
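To make this concrete, here is a minimal sketch on the sample data from the question. It assumes every condition is a boolean Series computed once on the original df, so all masks have the same length. The helper name combine_orders and the two-order list are hypothetical, chosen only for illustration, and np.logical_and.reduce is used here as an alternative to concatenating.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "D1": [8, 45, 35, 40, 12, 18, 25, 35, 95],
    "D2": [5, 35, 10, 5, 10, 15, 15, 10, 50],
    "D3": [0, 0, 1, 2, 5, 13, 5, 11, 0],
    "D5": [False, True, False, True, False, False, True, False, False],
    "D6": [True, False, True, False, False, True, False, True, False],
})

# Conditions are built once, against the original (unfiltered) df.
A = df["D1"].le(50)
B = df["D2"].ge(5)
C = df["D3"].ne(0)

# Hypothetical subset of orders, for illustration only.
orders = [[A, B], [~C, B]]


def combine_orders(orders):
    """AND together every condition in every order into one boolean mask."""
    # Flatten the list of lists into a single list of boolean Series.
    flat = [cond for order in orders for cond in order]
    # Reduce row-wise: a row survives only if all conditions are True.
    return np.logical_and.reduce(flat)


mask = combine_orders(orders)
result = df[mask]
print(result)
```

Because filtering step by step is the same as AND-ing all the masks, a single mask applied once avoids creating an intermediate DataFrame per order, which should help with 64 conditions on 5 million rows.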

Upvotes: 1
