Anthony Brenelière

Reputation: 63530

How to filter a pandas DataFrame without referencing columns?

In a pandas DataFrame, I am looking for a way to remove all rows where a condition evaluates to False in at least one column.

I need this to work for any DataFrame, which means I don't know the column names and cannot reference them.

For example:

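import numpy as np
import pandas as pd

# randint excludes the upper bound, so (0, 11) covers the same range as the deprecated random_integers(0, 10)
df = pd.DataFrame( { 'a': np.random.randint(0, 11, 10), 'b': np.random.randint(0, 11, 10) } )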

# filter without referencing columns:
print( df[ df % 2 == 0] )

# filter with column referencing :
print( df[ (df.a % 2 == 0) & (df.b % 2 == 0)] )

This produces the following output:

      a     b
0   NaN   NaN
1   NaN   6.0
2   4.0   NaN
3   8.0  10.0
4  10.0   NaN
5   4.0   NaN
6   NaN   2.0
7   NaN   NaN
8   6.0   NaN
9   0.0   NaN

   a   b
3  8  10

The goal is to get the filtered result (like the second output) without referencing columns, so that the filter does not depend on a specific DataFrame.

With the same code:

# randint excludes the upper bound, so (0, 11) covers the same range as the deprecated random_integers(0, 10)
df = pd.DataFrame( { 'Nantes': np.random.randint(0, 11, 10), 'Paris': np.random.randint(0, 11, 10) } )

would produce:

   Nantes  Paris
3       8     10

Upvotes: 1

Views: 229

Answers (1)

jezrael

Reputation: 863031

Add DataFrame.all with axis=1, which returns True for each row where the condition is True in every column:

np.random.seed(2019)
df = pd.DataFrame( { 'a': np.random.random_integers(0, 10, 10), 
                     'b': np.random.random_integers(0, 10, 10) } )

print ((df % 2 == 0))
       a      b
0   True   True
1   True  False
2  False  False
3   True   True
4   True   True
5   True  False
6   True  False
7   True   True
8   True  False
9  False   True

print (df[(df % 2 == 0).all(axis=1)])
   a  b
0  8  8
3  8  0
4  6  2
7  0  8

print( df[ (df.a % 2 == 0) & (df.b % 2 == 0)] )
   a  b
0  8  8
3  8  0
4  6  2
7  0  8
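
The same idea can be wrapped in a small reusable function so the filter never mentions a column name. This is only a sketch: the helper name keep_rows_where_all and the sample data are hypothetical, chosen to mirror the Nantes/Paris example from the question.

```python
import pandas as pd


def keep_rows_where_all(df, predicate):
    """Keep rows for which predicate is True in every column (hypothetical helper)."""
    mask = predicate(df)          # element-wise boolean DataFrame, same shape as df
    return df[mask.all(axis=1)]   # reduce across columns to one boolean per row, then filter rows


# Fixed sample data (an assumption for illustration, not the random data used above)
df = pd.DataFrame({"Nantes": [1, 6, 4, 8], "Paris": [3, 7, 9, 10]})

even = keep_rows_where_all(df, lambda d: d % 2 == 0)
print(even)
```

With this sample data, only the row at index 3 (8 and 10) is kept. Indexing with the two-dimensional mask directly, as in df[df % 2 == 0], keeps every row and replaces non-matching cells with NaN, which is why the first output in the question is not filtered.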

Upvotes: 2
