B. Biehler

Reputation: 101

NAN in scipy.ndimage.filters.maximum_filter

I don't quite understand how the maximum_filter function handles NaN. I expected either that NaNs would be ignored or, better, that the result would be NaN whenever a NaN appears anywhere in the kernel window. Instead, NaNs seem to be treated differently depending on where they appear.

The problem seems somewhat similar to the question "Scipy maximum_filter is crazy".

Here is some example code, run with SciPy version 0.19.1:

    import numpy as np
    import scipy.ndimage.filters

    a = np.array([[    0,     0.,   1.,   2.,   3.,   4.],
                  [    0., np.nan,   1.,   2.,   3.,   2.],
                  [    0.,     0.,   1.,   2.,   3.,   4.],
                  [    1.,     0.,   1.,   2.,   3.,   4.]])

    b = np.array([[np.nan,     0.,   1.,   2.,   3.,   4.],
                  [    0.,     0,   1.,   2.,   3.,   2.],
                  [    0.,     0.,   1.,   2.,   3.,   4.],
                  [    1.,     0.,   1.,   2.,   3.,   4.]])

    c = np.array([[np.nan,     0.,   1.,   2.,   3.,   4.],
                  [    0., np.nan,   1.,   2.,   3.,   2.],
                  [    0.,     0.,   1.,   2.,   3.,   4.],
                  [    1.,     0.,   1.,   2.,   3.,   4.]])


    print(scipy.ndimage.filters.maximum_filter(a, size=3))
    print(scipy.ndimage.filters.maximum_filter(b, size=3))
    print(scipy.ndimage.filters.maximum_filter(c, size=3))

This gives the following output:

    [[ 0.  1.  2.  3.  4.  4.]
     [ 0.  1.  2.  3.  4.  4.]
     [ 1.  1.  2.  3.  4.  4.]
     [ 1.  1.  2.  3.  4.  4.]]
    [[ nan  nan   2.   3.   4.   4.]
     [ nan  nan   2.   3.   4.   4.]
     [  1.   1.   2.   3.   4.   4.]
     [  1.   1.   2.   3.   4.   4.]]
    [[ nan  nan   2.   3.   4.   4.]
     [ nan  nan   2.   3.   4.   4.]
     [  1.   1.   2.   3.   4.   4.]
     [  1.   1.   2.   3.   4.   4.]]

In "a" the NaN is ignored, in "b" it seems that every comparison with NaN results in NaN, and "c" gives exactly the same result as "b".

Questions:
1. Is this a bug or can the behavior be somehow justified?
2. How can I get the "b" result for NANs not in the upper left corner?

Upvotes: 1

Views: 1985

Answers (1)

MSeifert

Reputation: 152677

Try to avoid NaNs in functions that don't explicitly state that they have special NaN handling. NaN means not-a-number, so don't use it where a number is expected!

SciPy's maximum_filter is one of those functions.

I thought about going into the SciPy internals, but since these are implementation details that might change without notice or deprecation, it's probably not worth it. It would also be really convoluted, because the result depends on the order of the comparisons, on the comparison itself, and on how the function computes the maximum filter (I suspect it uses a heap-based running maximum filter).

However, you can of course get the desired results. If you want NaNs to be ignored, you can replace them with -np.inf (for maximum_filter), and if you want them to be "propagated", you could use, for example, a generic filter:
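To illustrate why the order of the comparisons matters, here is a minimal sketch of a running maximum written with a plain `>` comparison. This is not SciPy's actual implementation, and `naive_max` is a hypothetical helper. It only shows that every comparison with NaN is False, so a NaN seen first "sticks" while a NaN seen later is skipped:

```python
def naive_max(values):
    """Running maximum using only `>`; for illustration, not SciPy's code."""
    current = values[0]
    for v in values[1:]:
        # Any comparison involving NaN evaluates to False.
        if v > current:
            current = v
    return current


nan = float("nan")

# NaN is the first element: nothing compares greater than NaN, so it is kept.
print(naive_max([nan, 0.0, 1.0]))

# NaN appears later: `nan > current` is False, so it is silently skipped.
print(naive_max([0.0, nan, 1.0]))
```

This mirrors the pattern in the question, where a NaN in the corner propagates and a NaN elsewhere is ignored, although the real code path in SciPy may differ.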

    import numpy as np
    import scipy.ndimage


    def maximum_filter_ignore_nan(array, *args, **kwargs):
        # Replace NaN with -inf so it can never win a maximum comparison.
        # A window consisting only of NaNs therefore yields -inf.
        nans = np.isnan(array)
        replaced = np.where(nans, -np.inf, array)
        return scipy.ndimage.maximum_filter(replaced, *args, **kwargs)


    def maximum_filter_propagate_nan(array, *args, **kwargs):
        def inner(window):
            # generic_filter passes each window as a flat 1-D array.
            if np.isnan(window).any():
                return np.nan
            return window.max()
        return scipy.ndimage.generic_filter(array, inner, *args, **kwargs)


    # a, b and c are the arrays from the question.
    print(maximum_filter_ignore_nan(a, size=3))
    print(maximum_filter_ignore_nan(b, size=3))
    print(maximum_filter_ignore_nan(c, size=3))

    print(maximum_filter_propagate_nan(a, size=3))
    print(maximum_filter_propagate_nan(b, size=3))
    print(maximum_filter_propagate_nan(c, size=3))
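Because generic_filter calls a Python function for every window, the propagating variant can be slow on large arrays. A possible vectorized alternative, sketched here under the assumption that both filter calls use the same `size` and the default boundary mode, is to run maximum_filter a second time on the NaN mask and overwrite every window that contains a NaN. The name `maximum_filter_propagate_nan_fast` is hypothetical:

```python
import numpy as np
import scipy.ndimage


def maximum_filter_propagate_nan_fast(array, size=3):
    """Maximum filter that returns NaN wherever the window contains a NaN."""
    array = np.asarray(array, dtype=float)
    nans = np.isnan(array)

    # Hide the NaNs from the ordinary filter so it never compares against them.
    replaced = np.where(nans, -np.inf, array)
    result = scipy.ndimage.maximum_filter(replaced, size=size)

    # The maximum of the 0/1 NaN mask over a window is 1 exactly when
    # that window contains at least one NaN.
    mask = scipy.ndimage.maximum_filter(nans.astype(np.uint8), size=size)
    result[mask.astype(bool)] = np.nan
    return result


example = np.array([[0., 0.,     1., 2., 3., 4.],
                    [0., np.nan, 1., 2., 3., 2.],
                    [0., 0.,     1., 2., 3., 4.],
                    [1., 0.,     1., 2., 3., 4.]])
print(maximum_filter_propagate_nan_fast(example, size=3))
```

For the `example` array, every output cell whose 3x3 window touches the NaN at position (1, 1) becomes NaN, and the remaining cells hold the ordinary maximum.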

Upvotes: 4
