Jackie

Reputation: 41

Why are the max and min of a numpy array nan?

What could be the reason that the max and min of my numpy array are nan? I checked my array with:

for i in range(data[0]):
    if data[i] == numpy.nan:
        print("nan")    

And there is no nan in my data. Is my search wrong? If not: What could be the reason for max and min being nan?

Upvotes: 3

Views: 3183

Answers (4)

sameer_nubia

Reputation: 811

Easy: use np.nanmax(variable_name) and np.nanmin(variable_name).

import numpy as np

z = np.arange(10, 20)
z = np.where(z < 15, np.nan, z)  # replace every value below 15 with nan
print(z)
print("z max value excluding nan :", np.nanmax(z))
print("z min value excluding nan :", np.nanmin(z))

Upvotes: 0

Niko Fohr

Reputation: 33740

The reason is that np.nan == x is always False, even when x is np.nan. This is consistent with the definition of NaN given on Wikipedia.

Check for yourself:

In [4]: import numpy as np

In [5]: np.nan == np.nan
Out[5]: False

If you want to check if a number x is np.nan, you must use

np.isnan(x)
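
A minimal sketch of the difference between the two checks, assuming a hypothetical scalar x that holds NaN (the variable names are illustrative and do not come from the question):

```python
import math
import numpy as np

x = np.float64("nan")  # hypothetical value standing in for one array element

eq_check = bool(x == np.nan)     # equality never matches NaN
isnan_check = bool(np.isnan(x))  # the dedicated test does
self_neq = bool(x != x)          # NaN is the only float that differs from itself
math_check = math.isnan(x)       # standard-library equivalent for scalars

print(eq_check, isnan_check, self_neq, math_check)
```

np.isnan also works element-wise on whole arrays, whereas math.isnan only accepts a single number.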

If you want the max/min of an array that contains NaNs, use np.nanmax() / np.nanmin():

minval = np.nanmin(data)

Upvotes: 1

Valdi_Bo

Reputation: 30971

Balaji Ambresh showed precisely how to find min / max even if the source array contains NaN, so there is nothing to add on this matter.

But your code sample also contains other flaws that deserve to be pointed out.

  1. Your loop starts with for i in range(data[0]):. You probably wanted to run the loop once for each element of data, but it will instead run as many times as the value of the first element of data.

    Variations:

    • If it is e.g. 1, it will be executed only once.
    • If it is 0 or negative, it will not be executed at all.
    • If it is greater than the size of data, an IndexError exception will be raised.
    • If your array contains at least one NaN, then the whole array is of float type (NaN is a special case of float) and you get a TypeError exception: 'numpy.float64' object cannot be interpreted as an integer.

    Remedy (one possible variant): The loop should start with for elem in data: and the code inside should use elem as the current element of data.

  2. The next line contains if data[i] == numpy.nan:. Even if you corrected it to if elem == np.nan:, the code inside the if block would never be executed. The reason is that np.nan is by definition not equal to any other value, even if this other value is another np.nan.

    Remedy: Change it to if np.isnan(elem): (Balaji wrote in his comment how to change your code; I added why).
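
Putting both remedies together, the corrected loop could look roughly like this. The sample array is an assumption, since the question does not show the actual data, and nan_positions is an illustrative name:

```python
import numpy as np

# Hypothetical sample data; the array from the question is not shown.
data = np.array([1.0, 2.0, np.nan, 4.0])

nan_positions = []
for idx, elem in enumerate(data):  # iterate over the elements themselves
    if np.isnan(elem):             # comparing with == np.nan would never match
        nan_positions.append(idx)

print(nan_positions)
```

enumerate is used here only to report where the NaN sits; a plain for elem in data: is enough if the position does not matter.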

And finally, how to quickly check an array for NaNs:

  1. To see, element by element, whether each value is NaN, run np.isnan(data) and you will get a bool array.

  2. To get a single answer, whether data contains at least one NaN, no matter where, run np.isnan(data).any().

This code is shorter and runs significantly faster than an explicit Python loop.
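
A short sketch of these vectorized checks on a hypothetical array (the data and the variable names are assumptions for illustration):

```python
import numpy as np

# Hypothetical sample data with two NaN entries.
data = np.array([1.0, np.nan, 3.0, np.nan])

mask = np.isnan(data)               # bool array, True where the value is NaN
has_nan = bool(mask.any())          # single answer: is there at least one NaN?
nan_count = int(mask.sum())         # how many NaNs there are
nan_indices = np.flatnonzero(mask)  # positions of the NaNs

print(mask)
print(has_nan, nan_count, nan_indices)
```

Counting and locating the NaNs goes slightly beyond the two checks listed above, but it reuses the same bool mask.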

Upvotes: 1

Balaji Ambresh

Reputation: 5037

Here you go:

import numpy as np

a = np.array([1, 2, 3, np.nan, 4])

print(f'a.max() = {a.max()}')
print(f'np.nanmax(a) = {np.nanmax(a)}')

print(f'a.min() = {a.min()}')
print(f'np.nanmin(a) = {np.nanmin(a)}')

Output:

a.max() = nan
np.nanmax(a) = 4.0
a.min() = nan
np.nanmin(a) = 1.0

Upvotes: 6
