Why do `max` and `min` have such strange behavior with numpy.nan?

Question

I accidentally stumbled upon some strange behavior with max, min and numpy.nan and I'm curious about what's going on under the hood.

Consider the following code run in Python 3:

import numpy as np

max(np.nan, 0)     # outputs nan 
max(np.nan, 10000) # outputs nan
max(0, np.nan)     # outputs 0
max(10000, np.nan) # outputs 10000

I've played around with a number of values, and it seems that the first value given is always what's returned. The same behavior can be observed with min. I would have expected the output to consistently be nan, or even an error to be raised, so this is quite unexpected. math.nan does the same thing.

I'm very curious about this behavior -- does anyone have any ideas?

Prune · Accepted Answer

To see what's going on, write your own version of max. Remember that any less-than, greater-than, or equality comparison involving NaN returns False. For instance,

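def my_max(values):
    # Start with the first element as the running maximum.
    result = values[0]
    for val in values[1:]:
        # Any comparison involving NaN is False, so this branch is skipped
        # whenever result or val is NaN.
        if result < val:
            result = val
    return result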

When you begin with a number, the comparison against nan is False, so that number stays as the result. When you start with nan, every comparison is False, and the result is stuck at that initial nan value.
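
A quick way to check this is to run the comparisons directly. The sketch below uses math.nan from the standard library (the question notes it behaves the same as numpy.nan) and restates my_max from above, with the parameter named values here, so that the block runs on its own. The variable names starts_with_number and starts_with_nan are only for illustration.

```python
import math

nan = math.nan

# Every ordering comparison with NaN is False, in either direction.
print(nan < 0, nan > 0, 0 < nan, 0 > nan)  # False False False False


def my_max(values):
    """Same loop as above, repeated so this block is self-contained."""
    result = values[0]
    for val in values[1:]:
        if result < val:
            result = val
    return result


# Starting from a number: `0 < nan` is False, so 0 is kept.
starts_with_number = my_max([0, nan])

# Starting from NaN: `nan < 0` is False, so NaN is never replaced.
starts_with_nan = my_max([nan, 0])

print(starts_with_number)  # 0
print(starts_with_nan)     # nan
```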

It's not always the first value; it's whatever the above mechanics produce. For instance:

>>> import numpy
>>> nan = numpy.nan
>>> max([7, nan, 15, nan, 5])
15
>>> max([nan, 7, nan, 15, nan, 5])
nan
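
If what you want is the behavior the question expected, where the result is nan whenever any input is nan, one option is to check for NaN explicitly before calling max. The following is only a sketch: nan_propagating_max is a hypothetical helper name, not a built-in or NumPy function, and it assumes the inputs are ordinary ints and floats.

```python
import math


def nan_propagating_max(values):
    """Hypothetical helper: return NaN if any element is NaN, else max(values)."""
    values = list(values)
    for val in values:
        # math.isnan is the reliable test, since `val == nan` is always False.
        if isinstance(val, float) and math.isnan(val):
            return val
    return max(values)


nan = math.nan
with_nan = nan_propagating_max([7, nan, 15, nan, 5])
without_nan = nan_propagating_max([7, 15, 5])

print(with_nan)     # nan
print(without_nan)  # 15
```

For arrays, NumPy has its own functions for this: numpy.max propagates NaN, while numpy.nanmax ignores it.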
