aeiou

Reputation: 447

Check if all dataframe row values are in specified range

How can I check, for each row of a DataFrame, whether all of its values fall within a specified range?

import pandas as pd

new = pd.DataFrame({'a': [1,2,3], 'b': [-5,-8,-3], 'c': [20,0,0]})

For instance, for the closed range [-5, 5]:

>>    a  b   c
>> 0  1 -5  20  # abs(20) > 5, hence no
>> 1  2 -8   0  # abs(-8) > 5, hence no
>> 2  3 -3   0  # abs(-3) <= 5, hence yes

My current solution iterates over the rows:

print(['no' if any(abs(i) > 5 for i in a) else 'yes' for _, a in new.iterrows()])

>> ['no', 'no', 'yes']

Upvotes: 0

Views: 306

Answers (2)

nikblg

Reputation: 31

For element-wise operations on numeric DataFrames, NumPy is a good fit.

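import pandas as pd
import numpy as np

df = pd.DataFrame({'a': [1, 2, 3], 'b': [-5, -8, -3], 'c': [20, 0, 0]})

# Work on the underlying NumPy array.
df_ndarray = df.values

# 1 where a value falls outside [-5, 5], 0 otherwise.
bin_mask = np.where((df_ndarray > 5) | (df_ndarray < -5), 1, 0)

# A row is valid when it has no out-of-range values (row sum along axis=1 is 0).
res = bin_mask.sum(axis=1) == 0
# res -> array([False, False,  True])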
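
The same row-wise check can also be written without the intermediate 0/1 mask. The following is a minimal sketch, assuming inclusive bounds as in the question; the helper name `rows_within` and its parameters are hypothetical and used only for illustration.

```python
import numpy as np
import pandas as pd


def rows_within(frame, low, high):
    """Return a boolean array: True where every value in a row lies in [low, high]."""
    values = frame.to_numpy()
    # Element-wise inclusive comparison, then reduce across columns (axis=1).
    return ((values >= low) & (values <= high)).all(axis=1)


df = pd.DataFrame({'a': [1, 2, 3], 'b': [-5, -8, -3], 'c': [20, 0, 0]})
result = rows_within(df, -5, 5)
print(result)  # [False False  True]
```

Reducing with `axis=1` is what makes this a per-row test; `axis=0` would instead report one value per column.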

Upvotes: 1

BeRT2me

Reputation: 13251

You can do this with vectorized comparisons:

# ge/le keep the bounds inclusive, matching the abs(i) > 5 test in the question.
out = (df.ge(-5) & df.le(5)).all(axis=1)
# Or, since the range is symmetric, supply a single value:
# out = df.abs().le(5).all(axis=1)
print(out)

Output:

0    False
1    False
2     True
dtype: bool

You could add this as a new column and map the values to no/yes if desired (which, in my opinion, is a terrible idea):

import numpy as np

df['valid'] = np.where(df.abs().le(5).all(axis=1), 'yes', 'no')
print(df)

Output:

   a  b   c valid
0  1 -5  20    no
1  2 -8   0    no
2  3 -3   0   yes
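
One reason a boolean result is usually more convenient than 'yes'/'no' strings is that it can index the DataFrame directly. A minimal sketch, assuming the same sample data and inclusive bounds; it uses `Series.between`, which is inclusive on both ends by default, as an alternative to the comparisons above. The names `mask` and `valid_rows` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [-5, -8, -3], 'c': [20, 0, 0]})

# Check each column against [-5, 5], then require every column to pass per row.
mask = df.apply(lambda col: col.between(-5, 5)).all(axis=1)

# The boolean mask selects the valid rows directly.
valid_rows = df[mask]
print(valid_rows)
```

With this data, only the row with index 2 remains in `valid_rows`.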

Upvotes: 3
