Max Boyar
Max Boyar

Reputation: 65

Is there a way to multiple filter Dataframe in Python-Pandas?

looking for some way to filter my Data frame by few criteria's (Dataframe for example:

id  Arrest  Shift_num  Description
0   True    20         Weapon
1   False   25         unarmed
2   True    30         Weapon 

I would like to get DF with: Description == Weapon and shift_num >= 25 and arrest == True (for example)

after few tries , that was my way, but i think it can be better than this :

arrest=(df.Arrest == True)
shift=(df.Shift_num >= 25)
weap= (df['Description'] == 'weapon')

print(df[arrest & shift & weap])

Thanks in advance :)

Upvotes: 4

Views: 76

Answers (3)

PascalVKooten
PascalVKooten

Reputation: 21461

You can use df.query (a bonus: it uses numexpr which is very optimized!):

import pandas as pd

df = pd.DataFrame({"Arrest": [True, False, True], 
                   "Shift_num": [20, 25, 30], 
                   "Description": ["Weapon", "unarmed", "Weapon"]})

df.query("Arrest & Shift_num >= 25 & Description == 'Weapon'")

Output:

   Arrest  Shift_num Description
2    True         30      Weapon

Some notes:

  • Don't forget to 'quote' strings
  • The variable names that can be used are from the DataFrame scope (without needing to prefix using df)
  • Use ~Arrest when you want NOT arrested
  • You can use @ to refer to a variable in the scope (i.e. not in the df)

I encouraged you to read about numexpr.

Upvotes: 4

DanDy
DanDy

Reputation: 461

What you've got works. Here is a one liner that may be slightly more efficient. Since Arrest is a boolean field, you can evaluate it directly instead of using the == True.

In [5]: df[(df.Description == 'Weapon') & (df.Shift_num >= 25) & (df.Arrest)] 
Out[5]: 
   id  Arrest  Shift_num Description
2   2    True         30      Weapon

Upvotes: 1

julian
julian

Reputation: 471

You can try slicing:

df = pd.DataFrame({'Arrest':[True,False,True],'Shift_num':[20,25,30],'Description':['Weapon','unarmed','Weapon']})

df.loc[(df['Description'] == 'Weapon') & (df['Shift_num'] > 25) & (df['Arrest'] == True)]

Upvotes: 1

Related Questions