Reputation: 1423
I accidentally stumbled upon some strange behavior with max, min and numpy.nan, and I'm curious about what's going on under the hood.
Consider the following code run in python3:
import numpy as np
max(np.nan, 0) # outputs nan
max(np.nan, 10000) # outputs nan
max(0, np.nan) # outputs 0
max(10000, np.nan) # outputs 10000
I've played around with a number of values, and it seems that the first value given is always what's returned. The same behavior can be observed with min. I would have expected the output to consistently be nan, or even to throw an error, but this is quite unexpected. math.nan does the same thing.
I'm very curious about this behavior -- does anyone have any ideas?
Upvotes: 3
Views: 296
Reputation: 280227
max doesn't know anything about floats or NaN. It assumes that there actually is an ordering relationship between the arguments, and it may produce nonsensical results when there is no such relationship, as is the case with NaN.
numpy.maximum behaves more reasonably:
>>> numpy.maximum(numpy.nan, 1)
nan
>>> numpy.maximum(1, numpy.nan)
nan
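As a rough sketch of how the NumPy functions differ in their NaN handling: np.fmax, np.max and np.nanmax are not mentioned above and are brought in here only for comparison.

```python
import numpy as np

values = np.array([7.0, np.nan, 15.0, np.nan, 5.0])

# np.maximum propagates NaN regardless of argument order.
a = np.maximum(np.nan, 1.0)
b = np.maximum(1.0, np.nan)

# np.fmax returns the non-NaN operand when only one side is NaN.
c = np.fmax(np.nan, 1.0)

# np.max propagates NaN across an array; np.nanmax skips NaN entries.
d = np.max(values)
e = np.nanmax(values)

print(a, b, c, d, e)
```

Which of these is appropriate depends on whether a NaN in the input should poison the result or be ignored.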
Upvotes: 2
Reputation: 77827
Write your own version of max. Remember that NaN will cause any greater, equal, or less comparison to return False. For instance,
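def my_max(values):
    # Start with the first item as the running result.
    result = values[0]
    for val in values[1:]:
        # A comparison involving NaN is always False, so a NaN val never
        # replaces result, and a NaN result is never replaced.
        if result < val:
            result = val
    return result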
When you begin with a number, the comparison against nan is False, so that number stays the result. When you start with nan, every comparison is False, and the result is stuck at that initial nan value.
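The comparison rule behind this can be checked directly. A minimal sketch using math.nan from the standard library (numpy.nan is an ordinary float NaN as well, so it should behave the same way):

```python
import math

nan = math.nan

# Every ordering or equality comparison involving NaN evaluates to False.
comparisons = [nan < 1, nan > 1, nan == 1, 1 < nan, 1 > nan, nan == nan]
print(comparisons)

# Only != evaluates to True, which is why NaN is usually detected with math.isnan.
not_equal_to_itself = nan != nan
print(not_equal_to_itself, math.isnan(nan))
```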
It's not always the first value, just what you get with the above mechanics. For instance:
>>> nan = numpy.nan
>>> max([7, nan, 15, nan, 5])
15
>>> max([nan, 7, nan, 15, nan, 5])
nan
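If an order-independent result is wanted in plain Python, one option is to check for NaN before calling max. The helper below is a hypothetical sketch (the name nan_aware_max is made up for illustration) and assumes the NaN values are Python floats, which covers math.nan and numpy.nan but not every NumPy scalar type:

```python
import math

def nan_aware_max(values):
    """Hypothetical helper: return NaN if any item is a float NaN, else max()."""
    values = list(values)
    # Check for NaN up front so the answer does not depend on item order.
    if any(isinstance(v, float) and math.isnan(v) for v in values):
        return math.nan
    return max(values)

nan = math.nan
r1 = nan_aware_max([7, nan, 15, nan, 5])
r2 = nan_aware_max([nan, 7, nan, 15, nan, 5])
r3 = nan_aware_max([7, 15, 5])
print(r1, r2, r3)
```

Here both lists containing nan give nan, and the list without nan gives 15.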
Upvotes: 8