Nae

Reputation: 15315

How to check whether all values in a column satisfy a condition in a DataFrame?

How can I check if all values under col1 satisfy a condition such as > 2?

import pandas as pd

d = [
    {'col1': 3, 'col2': 'wasteful'},
    {'col1': 0, 'col2': 'hardly'},
    ]

df = pd.DataFrame(d)

I could write

if all(col1 > 2 for i, col1, col2 in df.itertuples()):
    pass  # do stuff

but is there a way that is more readable, faster, and/or has a smaller memory footprint?

Upvotes: 19

Views: 26085

Answers (3)

Aroc

Reputation: 1263

Another option is to apply a lambda function to the column, which gives a True/False result for each row:

import pandas as pd
df = pd.DataFrame(
    [{'col1': 3, 'col2': 'wasteful'},
    {'col1': 0, 'col2': 'hardly'},
    {'col1': 9, 'col2': 'stuff'}])

print(df['col1'].apply(lambda x: 2 < x < 8))

#result:
#0 True
#1 False
#2 False
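The apply call above yields one boolean per row rather than a single answer. As a minimal sketch (the variable names `per_row`, `all_in_range`, and `all_in_range_vec` are only illustrative), the result can be reduced with `.all()`, and the same range check can also be written without a lambda by combining two comparisons:

```python
import pandas as pd

df = pd.DataFrame(
    [{'col1': 3, 'col2': 'wasteful'},
     {'col1': 0, 'col2': 'hardly'},
     {'col1': 9, 'col2': 'stuff'}])

# Per-row check via apply, then reduce to a single boolean with .all()
per_row = df['col1'].apply(lambda x: 2 < x < 8)
all_in_range = bool(per_row.all())

# Equivalent check using vectorized comparisons instead of a Python lambda
all_in_range_vec = bool(((df['col1'] > 2) & (df['col1'] < 8)).all())

print(all_in_range, all_in_range_vec)
```

The vectorized form avoids calling a Python function once per row, so it is likely the faster of the two on large columns.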

Upvotes: 1

Sohaib Farooqi

Reputation: 5666

You can also use numpy.where to check whether all values in a DataFrame column satisfy a condition:

import numpy as np
import pandas as pd

d = [
  {'col1': 3, 'col2': 'wasteful'},
  {'col1': 0, 'col2': 'hardly'},
]

df = pd.DataFrame(d)
print(all(np.where(df['col1'] > 2, True, False)))
#False
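Since the comparison already produces a boolean array, `np.where(cond, True, False)` returns the same values it was given. A small sketch, assuming a pandas version that provides `Series.to_numpy()`, comparing that form with calling `.all()` on the NumPy array directly (the names `via_where` and `direct` are illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([
    {'col1': 3, 'col2': 'wasteful'},
    {'col1': 0, 'col2': 'hardly'},
])

values = df['col1'].to_numpy()

# Form used above: np.where maps the boolean condition to True/False again
via_where = all(np.where(values > 2, True, False))

# Same result without the extra np.where step
direct = bool((values > 2).all())

print(via_where, direct)
```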

Upvotes: 2

jezrael

Reputation: 862406

I think you need to create a boolean mask and then call all() to check whether every value is True:

print(df['col1'] > 2)
0     True
1    False
Name: col1, dtype: bool

print((df['col1'] > 2).all())
False
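As a sketch of how the mask might be used in the `if` from the question (the names `mask`, `failing`, `result`, and `empty_ok` are illustrative, not part of pandas):

```python
import pandas as pd

df = pd.DataFrame([
    {'col1': 3, 'col2': 'wasteful'},
    {'col1': 0, 'col2': 'hardly'},
])

mask = df['col1'] > 2  # boolean Series, one entry per row

if mask.all():
    result = 'all values are > 2'
else:
    # Rows where the condition does not hold
    failing = df.loc[~mask, 'col1']
    result = f'{len(failing)} value(s) fail the check'

print(result)

# Edge case: .all() on an empty Series is vacuously True
empty = pd.Series([], dtype='int64')
empty_ok = bool((empty > 2).all())
```

Two things to keep in mind: an empty column passes the check, and a comparison against NaN evaluates to False, so a column containing missing values will not satisfy `.all()`.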

Upvotes: 21
