Reputation: 3180
I feel like this probably has a simple solution, but I just can't figure it out.
I have a Pandas DF similar to this MWE:
In [92]: test_df = pd.DataFrame({'A': [1,2,3,4,5,6,7,8,9], 'B':[9,8,7,6,5,4,3,2,1]})
In [93]: test_df
Out[93]:
A B
0 1 9
1 2 8
2 3 7
3 4 6
4 5 5
5 6 4
6 7 3
7 8 2
8 9 1
What I want is to set all values in that df that are less than 4 to np.nan. I can get a df of booleans for this criterion:
In [94]: test_df < 4
Out[94]:
A B
0 True False
1 True False
2 True False
3 False False
4 False False
5 False False
6 False True
7 False True
8 False True
But I don't know the final step to make those True values np.nan. I thought this could be achieved with test_df.loc, but I wasn't successful in my attempts.
Upvotes: 6
Views: 10635
Reputation: 210832
You can assign NaN using boolean indexing:
In [25]: test_df[test_df < 4] = np.nan
In [26]: test_df
Out[26]:
A B
0 NaN 9.0
1 NaN 8.0
2 NaN 7.0
3 4.0 6.0
4 5.0 5.0
5 6.0 4.0
6 7.0 NaN
7 8.0 NaN
8 9.0 NaN
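Note that the columns come back as floats (9.0, 8.0, ...) because NaN is itself a float, so the integer columns get upcast. A minimal sketch that makes that conversion explicit before assigning; the astype(float) step and the name masked are my own additions, on the assumption that recent pandas versions are stricter about silent dtype changes during assignment:

```python
import numpy as np
import pandas as pd

test_df = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6, 7, 8, 9],
                        'B': [9, 8, 7, 6, 5, 4, 3, 2, 1]})

# Convert to float up front so the columns can hold NaN;
# astype returns a new frame, so test_df itself is left as-is
masked = test_df.astype(float)

# Boolean indexing: every cell where the condition is True becomes NaN
masked[masked < 4] = np.nan

print(masked.dtypes)
print(masked)
```

If you want to keep integer values, pandas' nullable Int64 dtype with pd.NA is another route, but that is a different missing-value marker than np.nan.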
An alternative solution uses where() with the negated condition:
In [43]: test_df.where(test_df >= 4)
Out[43]:
A B
0 NaN 9.0
1 NaN 8.0
2 NaN 7.0
3 4.0 6.0
4 5.0 5.0
5 6.0 4.0
6 7.0 NaN
7 8.0 NaN
8 9.0 NaN
or:
In [47]: test_df.where(~(test_df < 4))
Out[47]:
A B
0 NaN 9.0
1 NaN 8.0
2 NaN 7.0
3 4.0 6.0
4 5.0 5.0
5 6.0 4.0
6 7.0 NaN
7 8.0 NaN
8 9.0 NaN
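Unlike the boolean-indexing assignment, where() returns a new DataFrame and leaves test_df alone. It also takes an optional other argument if you ever want something besides NaN. A small sketch (the replacement value 0 and the name zero_filled are just for illustration):

```python
import pandas as pd

test_df = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6, 7, 8, 9],
                        'B': [9, 8, 7, 6, 5, 4, 3, 2, 1]})

# where() keeps values where the condition is True and replaces the rest
# with `other` (NaN when `other` is omitted)
zero_filled = test_df.where(test_df >= 4, other=0)

# The original frame is untouched because where() returns a new object
original_min = int(test_df.min().min())

print(zero_filled)
print(original_min)
```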
Upvotes: 3
Reputation: 862611
Use DataFrame.mask; by default, the True values of the boolean mask are replaced by NaN:
print(test_df.mask(test_df < 4))
A B
0 NaN 9.0
1 NaN 8.0
2 NaN 7.0
3 4.0 6.0
4 5.0 5.0
5 6.0 4.0
6 7.0 NaN
7 8.0 NaN
8 9.0 NaN
Another solution is to invert the condition and assign the result back:
test_df = test_df[test_df >= 4]
print(test_df)
A B
0 NaN 9.0
1 NaN 8.0
2 NaN 7.0
3 4.0 6.0
4 5.0 5.0
5 6.0 4.0
6 7.0 NaN
7 8.0 NaN
8 9.0 NaN
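For what it's worth, mask(), where() with the inverted condition, and selecting with the inverted boolean frame should all give the same result. A quick sketch to check that, using DataFrame.equals, which treats NaNs in the same positions as equal (the variable names are only for illustration):

```python
import pandas as pd

test_df = pd.DataFrame({'A': [1, 2, 3, 4, 5, 6, 7, 8, 9],
                        'B': [9, 8, 7, 6, 5, 4, 3, 2, 1]})

# mask() replaces cells where the condition is True
via_mask = test_df.mask(test_df < 4)

# where() keeps cells where the condition is True
via_where = test_df.where(test_df >= 4)

# Selecting with a boolean DataFrame keeps matching cells, NaN elsewhere
via_selection = test_df[test_df >= 4]

same_as_where = via_mask.equals(via_where)
same_as_selection = via_mask.equals(via_selection)
print(same_as_where, same_as_selection)
```

None of these three modify test_df unless you assign the result back to it.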
Upvotes: 4