Reputation: 3657
I have a pandas
DataFrame
df
:
import pandas as pd
data = {"Name": ["AAAA", "BBBB"],
"C1": [25, 12],
"C2": [2, 1],
"C3": [1, 10]}
df = pd.DataFrame(data)
df.set_index("Name")
which looks like this when printed (for reference):
C1 C2 C3
Name
AAAA 25 2 1
BBBB 12 1 10
I would like to choose rows for which C1
, C2
and C3
have values between 0
and 20
.
Can you suggest an elegant way to select those rows?
Upvotes: 42
Views: 148815
Reputation: 571
A more concise df.query
:
df.query("0 <= C1 <= 20 and 0 <= C2 <= 20 and 0 <= C3 <= 20")
or
df.query("0 <= @df <= 20").dropna()
Using @foo
in df.query
refers to the variable foo
in the environment.
Upvotes: 12
Reputation: 758
I like to use df.query() for these kind of things
df.query('C1>=0 and C1<=20 and C2>=0 and C2<=20 and C3>=0 and C3<=20')
Upvotes: 22
Reputation: 393903
Shorter version:
In [65]:
df[(df>=0)&(df<=20)].dropna()
Out[65]:
Name C1 C2 C3
1 BBBB 12 1 10
Upvotes: 27
Reputation: 2145
I think below should do it, but its elegance is up for debate.
new_df = old_df[((old_df['C1'] > 0) & (old_df['C1'] < 20)) & ((old_df['C2'] > 0) & (old_df['C2'] < 20)) & ((old_df['C3'] > 0) & (old_df['C3'] < 20))]
Upvotes: 58