Reputation: 33628
I have a 2D numpy array. Some of the values in this array are NaN. I want to perform certain operations using this array. For example, consider the array:
[[ 0. 43. 67. 0. 38.]
[ 100. 86. 96. 100. 94.]
[ 76. 79. 83. 89. 56.]
[ 88. NaN 67. 89. 81.]
[ 94. 79. 67. 89. 69.]
[ 88. 79. 58. 72. 63.]
[ 76. 79. 71. 67. 56.]
[ 71. 71. NaN 56. 100.]]
I am trying to take each row, one at a time, sort it in reversed order to get max 3 values from the row and take their average. The code I tried is:
# nparr is a 2D numpy array
for entry in nparr:
    sortedentry = sorted(entry, reverse=True)
    highest_3_values = sortedentry[:3]
    avg_highest_3 = float(sum(highest_3_values)) / 3
This does not work for rows containing NaN. My question is: is there a quick way to convert all NaN values to zero in the 2D numpy array, so that I have no problems with sorting and the other things I am trying to do?
Upvotes: 135
Views: 431546
Reputation: 238209
This should work:
import numpy as np
a = np.array([[1, 2, 3], [0, 3, np.nan]])
where_are_NaNs = np.isnan(a)
a[where_are_NaNs] = 0
In the above case where_are_NaNs is:
In [12]: where_are_NaNs
Out[12]:
array([[False, False, False],
       [False, False,  True]], dtype=bool)
A note on efficiency. The timings below were run with NumPy 1.21.2:
>>> aa = np.random.random(1_000_000)
>>> a = np.where(aa < 0.15, np.nan, aa)
>>> %timeit a[np.isnan(a)] = 0
536 µs ± 8.11 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
>>> a = np.where(aa < 0.15, np.nan, aa)
>>> %timeit np.where(np.isnan(a), 0, a)
2.38 ms ± 27.8 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> a = np.where(aa < 0.15, np.nan, aa)
>>> %timeit np.nan_to_num(a, copy=True)
8.11 ms ± 401 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
>>> a = np.where(aa < 0.15, np.nan, aa)
>>> %timeit np.nan_to_num(a, copy=False)
3.8 ms ± 70.2 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Consequently, a[np.isnan(a)] = 0 is the fastest of these four options in this benchmark.
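Apart from speed, the approaches timed above differ in whether they modify the array or return a new one. The following is a minimal sketch of that difference on a small, made-up array; the variable names are illustrative only and do not come from the answer.

```python
import numpy as np

# Small illustrative array (hypothetical data, not from the benchmark above)
original = np.array([[1.0, np.nan], [np.nan, 4.0]])

# 1. Boolean-mask assignment modifies the array it is applied to,
#    so work on a copy if the original must be preserved.
in_place = original.copy()
in_place[np.isnan(in_place)] = 0

# 2. np.where builds and returns a new array; `original` is untouched.
via_where = np.where(np.isnan(original), 0, original)

# 3. np.nan_to_num with copy=True also returns a new array.
via_nan_to_num = np.nan_to_num(original, copy=True)

expected = np.array([[1.0, 0.0], [0.0, 4.0]])
print(np.array_equal(in_place, expected))
print(np.array_equal(via_where, expected))
print(np.array_equal(via_nan_to_num, expected))
```

All three produce the same values here, so the choice is mostly about whether an in-place update is acceptable.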
Upvotes: 149
Reputation: 31672
You could use np.where to find where you have NaN:
import numpy as np
a = np.array([[0, 43, 67, 0, 38],
              [100, 86, 96, 100, 94],
              [76, 79, 83, 89, 56],
              [88, np.nan, 67, 89, 81],
              [94, 79, 67, 89, 69],
              [88, 79, 58, 72, 63],
              [76, 79, 71, 67, 56],
              [71, 71, np.nan, 56, 100]])
b = np.where(np.isnan(a), 0, a)
In [20]: b
Out[20]:
array([[ 0., 43., 67., 0., 38.],
[ 100., 86., 96., 100., 94.],
[ 76., 79., 83., 89., 56.],
[ 88., 0., 67., 89., 81.],
[ 94., 79., 67., 89., 69.],
[ 88., 79., 58., 72., 63.],
[ 76., 79., 71., 67., 56.],
[ 71., 71., 0., 56., 100.]])
Upvotes: 29
Reputation: 791
You can use a lambda function. An example for a 1D array:
import numpy as np
a = [np.nan, 2, 3]
# In Python 3, map returns an iterator, so wrap it in list() to see the values
list(map(lambda v: 0 if np.isnan(v) else v, a))
This will give you the result:
[0, 2, 3]
Upvotes: -1
Reputation: 3272
You can use numpy.nan_to_num:
numpy.nan_to_num(x): Replace nan with zero and inf with finite numbers.
Example (see the docs):
>>> np.set_printoptions(precision=8)
>>> x = np.array([np.inf, -np.inf, np.nan, -128, 128])
>>> np.nan_to_num(x)
array([ 1.79769313e+308, -1.79769313e+308, 0.00000000e+000,
-1.28000000e+002, 1.28000000e+002])
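If the very large default replacements for infinity are not wanted, nan_to_num also accepts the keyword arguments nan, posinf and neginf (available in NumPy 1.17 and later). A minimal sketch, where the cap of 1e6 is an arbitrary value chosen for illustration:

```python
import numpy as np

x = np.array([np.inf, -np.inf, np.nan, -128.0, 128.0])

# nan, posinf and neginf set the replacement values explicitly
# (requires NumPy >= 1.17; the 1e6 cap is an arbitrary illustrative choice)
capped = np.nan_to_num(x, nan=0.0, posinf=1e6, neginf=-1e6)
print(capped)
```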
Upvotes: 5
Reputation: 38177
A code example for drake's answer, which uses nan_to_num:
>>> import numpy as np
>>> A = np.array([[1, 2, 3], [0, 3, np.nan]])
>>> A = np.nan_to_num(A)
>>> A
array([[ 1., 2., 3.],
[ 0., 3., 0.]])
Upvotes: 17
Reputation: 3277
NaN is never equal to itself, so for a scalar z:
if z != z:
    z = 0
For a 2D array, the same comparison gives a boolean mask that can be used for assignment:
nparr[nparr != nparr] = 0
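The self-inequality idea can be checked against np.isnan on a whole array. This is a small sketch with made-up data; note that the comparison has to be applied to the array itself, because comparing a full row inside an if statement does not yield a single True or False.

```python
import numpy as np

# Illustrative data only
nparr = np.array([[88.0, np.nan, 67.0], [71.0, 71.0, np.nan]])

# Elementwise x != x is True exactly where x is NaN
mask = nparr != nparr
same_as_isnan = np.array_equal(mask, np.isnan(nparr))

# Assigning through the mask replaces the NaN entries in place
nparr[mask] = 0
print(same_as_isnan)
print(nparr)
```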
Upvotes: 1
Reputation: 56841
For your purposes, if all the items are stored as str, you can use sorted as you already do, then check whether the first element is NaN and replace it with '0'. Note that strings sort lexicographically, so this relies on every number having the same number of digits:
>>> l1 = ['88','NaN','67','89','81']
>>> n = sorted(l1,reverse=True)
>>> n
['NaN', '89', '88', '81', '67']
>>> import math
>>> if math.isnan(float(n[0])):
... n[0] = '0'
...
>>> n
['0', '89', '88', '81', '67']
Upvotes: -9
Reputation: 43620
Where A is your 2D array:
import numpy as np
A[np.isnan(A)] = 0
The function isnan produces a bool array indicating where the NaN values are. A boolean array can be used to index an array of the same shape. Think of it like a mask.
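Putting the mask together with the goal stated in the question (the average of the three highest values in each row), one possible vectorized sketch is shown below. It assumes that treating NaN as zero is acceptable for the ranking; the names cleaned, top3 and avg_top3 are illustrative.

```python
import numpy as np

nparr = np.array([
    [0, 43, 67, 0, 38],
    [100, 86, 96, 100, 94],
    [76, 79, 83, 89, 56],
    [88, np.nan, 67, 89, 81],
    [94, 79, 67, 89, 69],
    [88, 79, 58, 72, 63],
    [76, 79, 71, 67, 56],
    [71, 71, np.nan, 56, 100],
], dtype=float)

# Work on a copy so the original array keeps its NaN values
cleaned = nparr.copy()
cleaned[np.isnan(cleaned)] = 0

# np.sort sorts each row in ascending order, so the three largest
# values of every row are in the last three columns
top3 = np.sort(cleaned, axis=1)[:, -3:]
avg_top3 = top3.mean(axis=1)
print(avg_top3)
```

This replaces the explicit Python loop with whole-array operations and produces one average per row.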
Upvotes: 202