user88484

Reputation: 1557

Majority filter Numpy array

I have a NumPy ndarray consisting of zeros, ones, and NaNs. I would like to apply a majority filter to it: a kernel window (e.g., 3x3 cells) slides over the array and sets the center cell to the value that occurs most often among its neighbors. The filter must satisfy two constraints: it should ignore NaNs, and if the center cell is 1, it should stay 1.

Here is a small example of what I'm looking for. Input array:

array([[ 1.,  1.,  1.,  0.,  0.],
       [ 1.,  1., nan,  1.,  1.],
       [nan,  1.,  1.,  0.,  1.],
       [ 0.,  0.,  0.,  0.,  1.]])

Output array after applying the majority filter:

array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1., nan,  1.,  1.],
       [nan,  1.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  1.,  1.]])

I looked at the SciPy filters but could not find anything suitable. I thought about building a generic convolution-based filter, but I'm not sure how to do that for a majority rule. It feels like this is quite a basic filter that should already exist, but I can't seem to find it.

Upvotes: 3

Views: 2619

Answers (2)

Divakar

Reputation: 221624

Here's one vectorized idea based on convolution. Given those constraints, only the 0s can ever change. For each sliding window, count the 1s and the non-NaN elements; the latter gives the threshold for deciding whether 1s are the majority. Where they are (a tie counts as a majority, which is what reproduces the expected output), set the cells that are currently 0 to 1.

The implementation would look something like this -

import numpy as np
from scipy.signal import convolve2d

def fill0s(a):
    # Mask of NaNs
    nan_mask = np.isnan(a)

    # Convolution kernel
    k = np.ones((3,3),dtype=int)

    # Get count of 1s for each kernel window
    ones_count = convolve2d(np.where(nan_mask,0,a),k,'same')

    # Get count of elements per window and hence non NaNs count
    n_elem = convolve2d(np.ones(a.shape,dtype=int),k,'same')
    nonNaNs_count = n_elem - convolve2d(nan_mask,k,'same')

    # Compare 1s count against half of nonNaNs_count for the first mask.
    # This tells us if 1s are majority among non-NaNs population.
    # Second mask would be of 0s in a. Use Combined mask to set 1s.
    final_mask = (ones_count >= nonNaNs_count/2.0) & (a==0)
    return np.where(final_mask,1,a)

Note that since we are performing uniform filtering with an all-ones kernel, we could also use scipy.ndimage.uniform_filter, which returns window means rather than window sums.

Sample run -

In [232]: a
Out[232]: 
array([[ 1.,  1.,  1.,  0.,  0.],
       [ 1.,  1., nan,  1.,  1.],
       [nan,  1.,  1.,  0.,  1.],
       [ 0.,  0.,  0.,  0.,  1.]])

In [233]: fill0s(a)
Out[233]: 
array([[ 1.,  1.,  1.,  1.,  1.],
       [ 1.,  1., nan,  1.,  1.],
       [nan,  1.,  1.,  1.,  1.],
       [ 0.,  0.,  0.,  1.,  1.]])
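
The uniform_filter remark above can be sketched as follows. This is only an illustration, not part of the original answer: the name fill0s_uniform and the size parameter are assumptions, and the window size is assumed to be odd. Because uniform_filter returns means, the sketch scales them by the window area and rounds to recover integer counts, so that ties are compared exactly.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fill0s_uniform(a, size=3):
    """Sketch of fill0s built on uniform_filter (hypothetical name)."""
    a = np.asarray(a, dtype=float)
    nan_mask = np.isnan(a)
    n = size * size  # number of cells in a full window

    def window_count(x):
        # uniform_filter gives the window mean with zero padding at the
        # borders; scale by the window area and round to get exact counts.
        mean = uniform_filter(x.astype(float), size=size,
                              mode='constant', cval=0.0)
        return np.rint(mean * n).astype(int)

    # Count of 1s per window (NaNs treated as 0 so they add nothing)
    ones_count = window_count(np.where(nan_mask, 0.0, a))
    # Count of non-NaN cells per window
    non_nan_count = window_count(~nan_mask)

    # 1s are the majority (ties included) and the center cell is 0
    final_mask = (2 * ones_count >= non_nan_count) & (a == 0)
    return np.where(final_mask, 1.0, a)
```

On the sample array above this should give the same result as fill0s, since both compare the same integer counts.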

Upvotes: 2

Aleksk89

Reputation: 93

Try the following code:

Note that the result differs slightly from yours because of how numpy.argmax behaves when several indices share the maximum value: it returns only the first one. You might want to write your own argmax function; for example, np.argwhere(x == np.max(x))[:, 0] gives all such indices instead of only the first.

import numpy as np

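def block_fn(x, center_val):
    # NaN centers stay NaN, and centers equal to 1 stay 1
    if np.isnan(center_val):
        return np.nan
    elif center_val == 1:
        return 1.0

    # Ignore NaNs when counting, as the question requires
    values = x[~np.isnan(x)]
    unique_elements, counts_elements = np.unique(values, return_counts=True)

    # argmax returns the first maximum, so a tie resolves to the smaller value
    return unique_elements[np.argmax(counts_elements)]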



def majority_filter(x, block_size=(3, 3)):
    # Only odd block sizes have a well-defined center cell
    assert block_size[0] % 2 != 0 and block_size[1] % 2 != 0

    # Half-widths of the window along each axis
    yy = (block_size[0] - 1) // 2
    xx = (block_size[1] - 1) // 2

    output = np.zeros_like(x)
    for i in range(x.shape[0]):
        # Clip the window at the array borders
        miny, maxy = max(0, i - yy), min(x.shape[0] - 1, i + yy)

        for j in range(x.shape[1]):
            minx, maxx = max(0, j - xx), min(x.shape[1] - 1, j + xx)

            # Extract the block to take the majority over
            block = x[miny:maxy + 1, minx:maxx + 1]

            output[i, j] = block_fn(block, center_val=x[i, j])

    return output


inp=np.array([[ 1.,  1.,  1.,  0.,  0.],
       [ 1.,  1., np.nan,  1.,  1.],
       [np.nan,  1.,  1.,  0.,  1.],
       [ 0.,  0.,  0.,  0.,  1.]])


print(majority_filter(inp))
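
The tie handling mentioned in the note could look roughly like this. It is a sketch only: block_fn_ties_to_one is a hypothetical name, and resolving ties in favor of 1 is an assumption made to match the expected output in the question. It also assumes the block contains the center cell, so at least one non-NaN value is present whenever the center is 0.

```python
import numpy as np

def block_fn_ties_to_one(block, center_val):
    """Hypothetical block function: ignores NaNs, ties resolve to 1."""
    # NaN centers stay NaN, and centers equal to 1 stay 1
    if np.isnan(center_val):
        return np.nan
    if center_val == 1:
        return 1.0

    # Count only the non-NaN values in the window
    values = block[~np.isnan(block)]
    uniques, counts = np.unique(values, return_counts=True)

    # Keep every value that reaches the maximum count, not just the first
    winners = uniques[counts == counts.max()]

    # np.unique returns sorted values, so the last winner is the largest;
    # with only 0s and 1s present, a tie therefore resolves to 1
    return float(winners[-1])
```

It could be passed to the loop above in place of block_fn without other changes.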

Upvotes: 0
