How to replace all values in a dataframe based on two conditions

Question

I'd like to replace all values in a df that are between -0.5 and 0.5 with NaNs.

For just the latter condition this solution works nicely:

df[df < 0.5] = np.nan

I can't, however, figure out how to add a second condition like:

df[-0.5 < df < 0.5] = np.nan

Any help would be greatly appreciated!

Thanks.

sacuL · Accepted Answer

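The chained comparison -0.5 < df < 0.5 fails because Python evaluates it as (-0.5 < df) and (df < 0.5), and "and" needs a single truth value, which pandas refuses to give for a whole DataFrame (it raises a ValueError). Instead, build the two conditions separately, df < 0.5 and df > -0.5, and combine them with the element-wise & operator. The parentheses are required because & binds more tightly than the comparison operators: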

df[(df < 0.5) & (df > -0.5)] = np.nan

For instance:

import pandas as pd
import numpy as np

# Example df: two columns of 100 normally distributed values, scaled by 2
df = pd.DataFrame(data={'data1': 2*np.random.randn(100),
                        'data2': 2*np.random.randn(100)})

# Show example with all values as original
>>> df.head(10)
      data1     data2
0 -0.113909  3.625936
1 -2.795349 -1.362933
2 -3.750103  2.686047
3  3.286711 -2.937002
4 -0.279161 -2.255135
5 -0.394181  3.937575
6 -1.166115  0.776880
7 -2.750386  0.681216
8  1.375598 -1.070675
9 -0.871180 -0.122937


# Replace every value strictly between -0.5 and 0.5 with NaN (modifies df in place)
df[(df < 0.5) & (df > -0.5)] = np.nan

# Show df with NaN when between -0.5 and 0.5
>>> df.head(10)
      data1     data2
0       NaN  3.625936
1 -2.795349 -1.362933
2 -3.750103  2.686047
3  3.286711 -2.937002
4       NaN -2.255135
5       NaN  3.937575
6 -1.166115  0.776880
7 -2.750386  0.681216
8  1.375598 -1.070675
9 -0.871180       NaN
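
If you would rather not modify df in place, DataFrame.mask is one possible alternative: it returns a new frame with NaN wherever the condition is True. The sketch below uses a small frame with made-up values (not the random data above) so the result is reproducible. Since the interval here happens to be symmetric around zero, df.abs() < 0.5 should give the same mask. Note that with strict inequalities, values exactly equal to -0.5 or 0.5 are kept.

```python
import numpy as np
import pandas as pd

# Small fixed frame with hypothetical values, chosen to include the boundaries
df = pd.DataFrame({'data1': [-0.11, -2.80, 0.50],
                   'data2': [3.63, -0.12, -0.50]})

# mask() replaces values where the condition is True (NaN by default)
# and returns a new DataFrame; df itself is left unchanged
masked = df.mask((df > -0.5) & (df < 0.5))

# The interval is symmetric around zero, so abs() expresses the same condition
same = df.mask(df.abs() < 0.5)

print(masked)
```

Use >= and <= in the condition if the endpoints should be replaced as well.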
