Reputation: 65
looking for some way to filter my Data frame by few criteria's (Dataframe for example:
id Arrest Shift_num Description
0 True 20 Weapon
1 False 25 unarmed
2 True 30 Weapon
I would like to get DF with:
Description == Weapon and shift_num >= 25
and arrest == True
(for example)
after few tries , that was my way, but i think it can be better than this :
arrest=(df.Arrest == True)
shift=(df.Shift_num >= 25)
weap= (df['Description'] == 'weapon')
print(df[arrest & shift & weap])
Thanks in advance :)
Upvotes: 4
Views: 76
Reputation: 21461
You can use df.query (a bonus: it uses numexpr
which is very optimized!):
import pandas as pd
df = pd.DataFrame({"Arrest": [True, False, True],
"Shift_num": [20, 25, 30],
"Description": ["Weapon", "unarmed", "Weapon"]})
df.query("Arrest & Shift_num >= 25 & Description == 'Weapon'")
Output:
Arrest Shift_num Description
2 True 30 Weapon
Some notes:
df
)~Arrest
when you want NOT arrested@
to refer to a variable in the scope (i.e. not in the df)I encouraged you to read about numexpr.
Upvotes: 4
Reputation: 461
What you've got works. Here is a one liner that may be slightly more efficient. Since Arrest
is a boolean field, you can evaluate it directly instead of using the == True
.
In [5]: df[(df.Description == 'Weapon') & (df.Shift_num >= 25) & (df.Arrest)]
Out[5]:
id Arrest Shift_num Description
2 2 True 30 Weapon
Upvotes: 1
Reputation: 471
You can try slicing:
df = pd.DataFrame({'Arrest':[True,False,True],'Shift_num':[20,25,30],'Description':['Weapon','unarmed','Weapon']})
df.loc[(df['Description'] == 'Weapon') & (df['Shift_num'] > 25) & (df['Arrest'] == True)]
Upvotes: 1