Reputation: 7237
Given an interval of valid values, every value in a pandas data frame column that falls outside it should be set to a given value, e.g. NaN. The limits and the data frame contents can be assumed to be numeric.
Given the following limits and data frame:
import pandas as pd
min = 2
max = 7
df = pd.DataFrame({'a': [5, 1, 7, 22], 'b': [12, 3, 10, 9]})
    a   b
0   5  12
1   1   3
2   7  10
3  22   9
Applying the limits to column a would result in:
     a   b
0    5  12
1  NaN   3
2    7  10
3  NaN   9
Upvotes: 2
Views: 124
Reputation: 106
You can also use .loc with between:
import pandas as pd
import numpy as np
df = pd.DataFrame({'a': [5, 1, 7, 22], 'b': [12, 3, 10, 9]})
min = 2
max = 7
# between() is inclusive on both ends; ~ inverts the mask to select the out-of-range rows
df.loc[~df.a.between(min, max), 'a'] = np.nan
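A minimal sketch of the same .loc idea wrapped in a reusable helper, assuming the limits are inclusive (the default behaviour of between). The function name mask_outside and the parameter names lo and hi are hypothetical, chosen here to avoid shadowing the built-ins min and max. The column is cast to float first because NaN cannot be stored in an integer column, and newer pandas versions may warn about the implicit upcast.

```python
import numpy as np
import pandas as pd


def mask_outside(df, column, lo, hi):
    """Return a copy of df where values of `column` outside [lo, hi] are NaN."""
    out = df.copy()
    # Cast explicitly so assigning NaN does not rely on silent upcasting.
    out[column] = out[column].astype(float)
    # between() is inclusive on both ends by default; ~ selects the outliers.
    out.loc[~out[column].between(lo, hi), column] = np.nan
    return out


df = pd.DataFrame({'a': [5, 1, 7, 22], 'b': [12, 3, 10, 9]})
result = mask_outside(df, 'a', 2, 7)
print(result)
```

Returning a copy keeps the original frame untouched, which makes it easier to compare before and after.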
Upvotes: 1
Reputation: 323366
Using where with between:
df.a = df.a.where(df.a.between(min, max), np.nan)
df
Out[146]:
     a   b
0  5.0  12
1  NaN   3
2  7.0  10
3  NaN   9
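Or clip, which clamps out-of-range values to the nearest limit rather than replacing them with NaN. The NaN values in the output below are left over from the where step above; on the original column the result would be 5, 2, 7, 7.
df.a.clip(min, max)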
Out[147]:
0    5.0
1    NaN
2    7.0
3    NaN
Name: a, dtype: float64
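To make the difference between the two approaches concrete, here is a small side-by-side sketch run on a fresh copy of the question's data. The names lo, hi, clipped and masked are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'a': [5, 1, 7, 22], 'b': [12, 3, 10, 9]})
lo, hi = 2, 7

# clip() clamps: values below lo become lo, values above hi become hi.
clipped = df['a'].clip(lo, hi)

# where() keeps values for which the condition holds and puts NaN elsewhere.
masked = df['a'].where(df['a'].between(lo, hi))

print(clipped.tolist())  # [5, 2, 7, 7]
print(masked.tolist())   # [5.0, nan, 7.0, nan]
```

So where (or the .loc assignment) matches the output requested in the question, while clip is the right tool only when out-of-range values should be pulled back to the limits.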
Upvotes: 6