Scratch

Reputation: 373

Building a boolean index in a for loop in pandas

I would like to get a subset of a pandas dataframe with boolean indexing.

The condition I want to test has the form (df[var_0] == value_0) & ... & (df[var_n] == value_n), where the number n of variables involved can change. As a result, I cannot hard-code:

df = df[(df[var_0] == value_0) & ... & (df[var_n] == value_n)]

I could do something like:

# variables = [var_0, ..., var_n] and values = [value_0, ..., value_n]
for var, value in zip(variables, values):
    df = df[df[var] == value]

(with some try/except to make sure it works if the DataFrame becomes empty), but that does not seem very efficient. Does anyone have an idea of how to write this as a clean pandas expression?

Upvotes: 0

Views: 1320

Answers (1)

TomAugspurger

Reputation: 28946

The isin method should work for you here.

In [7]: df
Out[7]: 
   a  b  c  d  e
0  6  3  1  9  6
1  8  9  5  7  2
2  6  4  7  4  3
3  4  8  0  0  5
4  4  4  2  3  4
5  2  5  9  0  9
6  4  8  2  9  1
7  3  0  8  9  7
8  0  5  9  9  6
9  0  7  8  4  8

[10 rows x 5 columns]

In [8]: vals = {'a': [3], 'b': [0], 'c': [8], 'd': [9], 'e': [7]}

In [9]: df.isin(vals)
Out[9]: 
       a      b      c      d      e
0  False  False  False   True  False
1  False  False  False  False  False
2  False  False  False  False  False
3  False  False  False  False  False
4  False  False  False  False  False
5  False  False  False  False  False
6  False  False  False   True  False
7   True   True   True   True   True
8  False  False  False   True  False
9  False  False   True  False  False

[10 rows x 5 columns]

In [10]: df[df.isin(vals).all(1)]
Out[10]: 
   a  b  c  d  e
7  3  0  8  9  7

[1 rows x 5 columns]

The values in the vals dict need to be collections, so I put them into length-1 lists. It's possible that query can do this too.
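If the conditions arrive as a plain {column: scalar} dict that covers only some of the columns, the isin approach can be wrapped in a small helper. The following is a minimal sketch, not a definitive implementation: the names filter_equal, filter_equal_query and conditions are hypothetical, and the query variant assumes column names are valid Python identifiers and values are plain Python literals.

```python
import pandas as pd

# The example frame from the session above.
df = pd.DataFrame({
    "a": [6, 8, 6, 4, 4, 2, 4, 3, 0, 0],
    "b": [3, 9, 4, 8, 4, 5, 8, 0, 5, 7],
    "c": [1, 5, 7, 0, 2, 9, 2, 8, 9, 8],
    "d": [9, 7, 4, 0, 3, 0, 9, 9, 9, 4],
    "e": [6, 2, 3, 5, 4, 9, 1, 7, 6, 8],
})


def filter_equal(frame, conditions):
    """Keep rows where frame[col] == value for every (col, value) pair.

    `conditions` is a hypothetical {column: scalar} dict of any length.
    """
    if not conditions:
        return frame
    cols = list(conditions)
    # isin expects a collection per column, so wrap each scalar in a list.
    wrapped = {col: [value] for col, value in conditions.items()}
    # Restrict to the constrained columns first: columns missing from the
    # dict come back all False from isin, so .all() would drop every row.
    mask = frame[cols].isin(wrapped).all(axis=1)
    return frame[mask]


def filter_equal_query(frame, conditions):
    """Same filter via DataFrame.query (assumes identifier-like column
    names and values whose repr is a valid literal, e.g. int or str)."""
    expr = " and ".join(f"{col} == {value!r}" for col, value in conditions.items())
    return frame.query(expr) if expr else frame


subset = filter_equal(df, {"a": 4, "b": 8})
print(subset)
```

With this frame, the conditions a == 4 and b == 8 select the rows at index 3 and 6. A condition that matches nothing simply returns an empty DataFrame, so no try/except is needed around the filter.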

Upvotes: 3
