Majority filter Numpy array

Question

I have a NumPy ndarray consisting of zeros, ones and NaNs. I would like to apply a majority filter to that array: a kernel window (e.g., 3x3 cells) slides over the array and changes the value of the center cell to the value that occurs most often among its neighbors. The filter should satisfy two constraints: it should ignore NaNs, and if the value of the center cell is one, it should stay one.

Here is a small example of what I'm looking for. Input array:

array([[ 1.,  1.,  1.,  0.,  0.],
       [ 1.,  1., nan,  1.,  1.],
       [nan,  1.,  1.,  0.,  1.],
       [ 0.,  0.,  0.,  0.,  1.]])

Output array after applying the majority filter:

array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1., nan,  1.,  1.],
       [nan,  1.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  1.,  1.]])

I looked at the SciPy filters but could not find anything suitable. I thought about building a generic convolution-based filter, but I'm not sure how to do that for a majority rule. This feels like quite a basic filter that should already exist, but I can't seem to find it.

Divakar · Accepted Answer

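Here's one vectorized idea based on convolution. Given those constraints, only the 0s ever need to be edited: 1s must stay 1 and NaNs are left alone. For each sliding window, count the 1s and count the non-NaN elements; the non-NaN count sets the threshold for deciding whether 1s are the majority. Where they are, set the cells that are currently 0 to 1. Note that in the code below the window includes the center cell itself, and a tie counts as a majority of 1s.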
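To make the counting concrete, here is a small sketch for a single window of the sample array from the question. The choice of cell (2, 3) and the variable names are only for illustration; the rule applied is the same "1s count >= half of the non-NaN count" test described above.

```python
import numpy as np

a = np.array([[1, 1, 1, 0, 0],
              [1, 1, np.nan, 1, 1],
              [np.nan, 1, 1, 0, 1],
              [0, 0, 0, 0, 1]], dtype=float)

# 3x3 window centered on cell (2, 3); it lies fully inside the array
window = a[1:4, 2:5]

# Count of 1s (NaNs are treated as 0 by nansum)
ones = int(np.nansum(window))

# Count of non-NaN elements in the window
non_nans = int(np.count_nonzero(~np.isnan(window)))

# 1s are the majority if they make up at least half of the non-NaNs
majority = ones >= non_nans / 2.0
print(ones, non_nans, majority)  # 5 8 True
```

Since the center cell is 0 and 1s are the majority among the 8 non-NaN cells, this cell becomes 1.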

The implementation would look something like this -

import numpy as np
from scipy.signal import convolve2d

def fill0s(a):
    # Mask of NaNs
    nan_mask = np.isnan(a)

    # Convolution kernel
    k = np.ones((3,3),dtype=int)

    # Get count of 1s for each kernel window
    ones_count = convolve2d(np.where(nan_mask,0,a),k,'same')

    # Get count of elements per window and hence non NaNs count
    n_elem = convolve2d(np.ones(a.shape,dtype=int),k,'same')
    nonNaNs_count = n_elem - convolve2d(nan_mask,k,'same')

    # Compare 1s count against half of nonNaNs_count for the first mask.
    # This tells us if 1s are majority among non-NaNs population.
    # Second mask would be of 0s in a. Use Combined mask to set 1s.
    final_mask = (ones_count >= nonNaNs_count/2.0) & (a==0)
    return np.where(final_mask,1,a)

Note that since we are performing uniform filtering with that kind of all-1s kernel, we can also use uniform_filter from scipy.ndimage.
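A possible sketch of that uniform_filter variant follows. The function name fill0s_uniform and the size parameter are made up for this illustration. It assumes zero padding at the borders (mode='constant') to mirror the 'same' convolution, and it multiplies the window means by the window area and rounds them to recover integer counts, so that ties are not affected by floating-point error.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fill0s_uniform(a, size=3):
    # Hypothetical variant of fill0s built on uniform_filter
    nan_mask = np.isnan(a)
    area = size * size

    # uniform_filter returns window means; scale by the window area and
    # round to get back integer counts. Zero padding matches 'same' mode.
    ones_count = np.rint(uniform_filter(
        np.where(nan_mask, 0.0, a), size=size, mode='constant', cval=0.0) * area)
    non_nans_count = np.rint(uniform_filter(
        (~nan_mask).astype(float), size=size, mode='constant', cval=0.0) * area)

    # 1s are the majority if they are at least half of the non-NaNs;
    # only cells that are currently 0 get changed
    final_mask = (2 * ones_count >= non_nans_count) & (a == 0)
    return np.where(final_mask, 1.0, a)
```

On the sample array this should give the same result as the convolution version shown in the sample run.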

Sample run -

In [232]: a
Out[232]: 
array([[ 1.,  1.,  1.,  0.,  0.],
       [ 1.,  1., nan,  1.,  1.],
       [nan,  1.,  1.,  0.,  1.],
       [ 0.,  0.,  0.,  0.,  1.]])

In [233]: fill0s(a)
Out[233]: 
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1., nan,  1.,  1.],
       [nan,  1.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  1.,  1.]])
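
As a sanity check on the sample run, a plain-loop version of the same rule can be written as below. This is only a reference sketch (fill0s_loop is a hypothetical name) under the same assumptions as the vectorized code: a 3x3 window that includes the center cell, windows truncated at the array edges, and ties counted as a majority of 1s. It is much slower on large arrays but easier to follow cell by cell.

```python
import numpy as np

def fill0s_loop(a):
    # Hypothetical brute-force reference for the 3x3 case
    out = a.copy()
    rows, cols = a.shape
    for i in range(rows):
        for j in range(cols):
            # Only 0s can change; 1s and NaNs are skipped (NaN != 0 is True)
            if a[i, j] != 0:
                continue
            # Window clipped at the array borders
            window = a[max(i - 1, 0):i + 2, max(j - 1, 0):j + 2]
            ones = np.nansum(window)
            non_nans = np.count_nonzero(~np.isnan(window))
            if ones >= non_nans / 2.0:
                out[i, j] = 1.0
    return out

a = np.array([[1, 1, 1, 0, 0],
              [1, 1, np.nan, 1, 1],
              [np.nan, 1, 1, 0, 1],
              [0, 0, 0, 0, 1]], dtype=float)
result = fill0s_loop(a)
print(result)
```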
