Ssank
Ssank

Reputation: 3657

selecting rows based on multiple column values in pandas dataframe

I have a pandas DataFrame df:

import pandas as pd

data = {"Name": ["AAAA", "BBBB"],
        "C1": [25, 12],
        "C2": [2, 1],
        "C3": [1, 10]}

df = pd.DataFrame(data)
df.set_index("Name")

which looks like this when printed (for reference):

      C1  C2  C3
Name            
AAAA  25   2   1
BBBB  12   1  10

I would like to choose rows for which C1, C2 and C3 have values between 0 and 20.

Can you suggest an elegant way to select those rows?

Upvotes: 42

Views: 148815

Answers (4)

Braham Snyder
Braham Snyder

Reputation: 571

A more concise df.query:

df.query("0 <= C1 <= 20 and 0 <= C2 <= 20 and 0 <= C3 <= 20")

or

df.query("0 <= @df <= 20").dropna()

Using @foo in df.query refers to the variable foo in the environment.

Upvotes: 12

Rob Buckley
Rob Buckley

Reputation: 758

I like to use df.query() for these kind of things

df.query('C1>=0 and C1<=20 and C2>=0 and C2<=20 and C3>=0 and C3<=20')

Upvotes: 22

EdChum
EdChum

Reputation: 393903

Shorter version:

In [65]:

df[(df>=0)&(df<=20)].dropna()
Out[65]:
   Name  C1  C2  C3
1  BBBB  12   1  10

Upvotes: 27

kennes
kennes

Reputation: 2145

I think below should do it, but its elegance is up for debate.

new_df = old_df[((old_df['C1'] > 0) & (old_df['C1'] < 20)) & ((old_df['C2'] > 0) & (old_df['C2'] < 20)) & ((old_df['C3'] > 0) & (old_df['C3'] < 20))]

Upvotes: 58

Related Questions