Reputation: 101
I don't quite understand how the maximum_filter function handles NaN. I expected either that NaNs would be ignored or, better, that the result would be NaN whenever a NaN appears anywhere in the kernel. Instead, NaNs seem to be treated differently depending on where they appear.
The problem seems somewhat similar to the question "Scipy maximum_filter is crazy".
Here is some example code, run with SciPy version 0.19.1:
import numpy as np
import scipy.ndimage.filters
a = np.array([[ 0, 0., 1., 2., 3., 4.],
              [ 0., np.nan, 1., 2., 3., 2.],
              [ 0., 0., 1., 2., 3., 4.],
              [ 1., 0., 1., 2., 3., 4.]])
b = np.array([[np.nan, 0., 1., 2., 3., 4.],
              [ 0., 0, 1., 2., 3., 2.],
              [ 0., 0., 1., 2., 3., 4.],
              [ 1., 0., 1., 2., 3., 4.]])
c = np.array([[np.nan, 0., 1., 2., 3., 4.],
              [ 0., np.nan, 1., 2., 3., 2.],
              [ 0., 0., 1., 2., 3., 4.],
              [ 1., 0., 1., 2., 3., 4.]])
print(scipy.ndimage.filters.maximum_filter(a, size=3))
print(scipy.ndimage.filters.maximum_filter(b, size=3))
print(scipy.ndimage.filters.maximum_filter(c, size=3))
This gives the following output:
[[ 0. 1. 2. 3. 4. 4.]
[ 0. 1. 2. 3. 4. 4.]
[ 1. 1. 2. 3. 4. 4.]
[ 1. 1. 2. 3. 4. 4.]]
[[ nan nan 2. 3. 4. 4.]
[ nan nan 2. 3. 4. 4.]
[ 1. 1. 2. 3. 4. 4.]
[ 1. 1. 2. 3. 4. 4.]]
[[ nan nan 2. 3. 4. 4.]
[ nan nan 2. 3. 4. 4.]
[ 1. 1. 2. 3. 4. 4.]
[ 1. 1. 2. 3. 4. 4.]]
In "a" NAN is ignored, in "b" it seems that every comparison with NAN results in NAN and "c" gives exactly the same result as "b".
Questions:
1. Is this a bug or can the behavior be somehow justified?
2. How can I get the "b" result for NANs not in the upper left corner?
Upvotes: 1
Views: 1985
Reputation: 152677
Don't use NaNs with functions that don't explicitly state they have special NaN handling. NaN is not-a-number, so don't use it where a number is expected! SciPy's maximum_filter is one such function.
I thought about going into the SciPy internals, but since these are implementation details that might change without notice or deprecation, it's probably not worth it. It would also get really convoluted, because the result depends on the order of the comparisons, on the comparison itself, and on how the function implements the maximum filter (I suspect they use a heap-based running maximum filter).
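To illustrate why the position of a NaN can matter at all, here is a minimal sketch of a plain comparison-based maximum. It is not SciPy's actual implementation, and naive_max is a hypothetical helper introduced only for this illustration. It relies on one fact: every ordered comparison involving NaN evaluates to False.

```python
import math

def naive_max(values):
    """Hypothetical comparison-based maximum, for illustration only."""
    best = values[0]
    for v in values[1:]:
        # Any comparison with NaN is False, so a NaN held in `best`
        # is never replaced, and a NaN in `v` is never picked up.
        if v > best:
            best = v
    return best

nan = float("nan")
print(naive_max([nan, 1.0, 2.0]))  # nan: NaN came first and "sticks"
print(naive_max([1.0, nan, 2.0]))  # 2.0: NaN in the middle is skipped
```

The same input values give different results depending on where the NaN sits, which is the kind of order dependence seen in the question.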
However, you can of course get the desired results. If you want NaNs to be ignored, you can replace them (for maximum_filter) with -np.inf, and if you want them to be "propagated", you could use, for example, a generic filter:
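def maximum_filter_ignore_nan(array, *args, **kwargs):
    # Replace NaNs with -inf so they can never win the maximum.
    nans = np.isnan(array)
    replaced = np.where(nans, -np.inf, array)
    return scipy.ndimage.filters.maximum_filter(replaced, *args, **kwargs)

def maximum_filter_propagate_nan(array, *args, **kwargs):
    def inner(window):
        # generic_filter passes each window as a flat 1-D array.
        if np.isnan(window).any():
            return np.nan
        return window.max()
    # Forward size, footprint, mode, etc. instead of hard-coding size=3.
    return scipy.ndimage.generic_filter(array, inner, *args, **kwargs)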
print(maximum_filter_ignore_nan(a, size=3))
print(maximum_filter_ignore_nan(b, size=3))
print(maximum_filter_ignore_nan(c, size=3))
print(maximum_filter_propagate_nan(a, size=3))
print(maximum_filter_propagate_nan(b, size=3))
print(maximum_filter_propagate_nan(c, size=3))
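A generic filter calls a Python function once per window, so it can be slow on large arrays. As a possible alternative, here is a sketch that gets the "propagate" behaviour from two maximum_filter calls: one on the data with NaNs replaced by -np.inf, and one on the NaN mask to find every window that contains a NaN. The name maximum_filter_propagate_nan_fast and its size-only signature are my own assumptions for this example; it assumes both calls use the same size and the default boundary mode so that the windows line up.

```python
import numpy as np
from scipy import ndimage

def maximum_filter_propagate_nan_fast(array, size=3):
    """Hypothetical helper: NaN-propagating maximum filter without generic_filter."""
    array = np.asarray(array, dtype=float)
    nans = np.isnan(array)
    # Maximum over the finite values only (NaNs can never win against -inf).
    result = ndimage.maximum_filter(np.where(nans, -np.inf, array), size=size)
    # A window contains a NaN exactly when the maximum of the mask is 1.
    touched = ndimage.maximum_filter(nans.astype(np.uint8), size=size).astype(bool)
    result[touched] = np.nan
    return result

a = np.array([[0., 0., 1., 2., 3., 4.],
              [0., np.nan, 1., 2., 3., 2.],
              [0., 0., 1., 2., 3., 4.],
              [1., 0., 1., 2., 3., 4.]])
out = maximum_filter_propagate_nan_fast(a, size=3)
print(out)
```

For the array above, every output cell whose 3x3 window covers the NaN at position (1, 1) becomes NaN, and all other cells hold the ordinary maximum.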
Upvotes: 4