Salvador Dali
Salvador Dali

Reputation: 222471

ndimage.generic_function on 3d array

I need to calculate a most frequent element in a matrix based on the neighbor values and itself. I found a generic_filter function which I used to calculate what I wanted. So here is how I can do this for a 2d array

arr = np.array([
    [1, 2, 4],
    [5, 6, 7],
    [2, 4, 4]
])

def most_frequent(arr):
    def most_frequent(val):
        return Counter(val).most_common(1)[0][0]

    footprint = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    return ndimage.generic_filter(arr, most_frequent, footprint=footprint, mode='constant')

print most_frequent(arr)

This returns me

[[0 0 0]
 [0 4 0]
 [0 0 0]]

Ignore the elements on the edge. As you see the middle element is 4 because this is a most frequent element among the neighbors and the value.


The big problem is that I need to do the same thing for a 3d matrix. So for a matrix like this

arr = np.array([
    [[1, 1], [2, 2], [4, 4]],
    [[5, 5], [6, 6], [7, 7]],
    [[2, 2], [4, 4], [4, 4]]
])

I expect to get [0, 0] everywhere and [4, 4] in the middle. This fails with RuntimeError('filter footprint array has incorrect shape.'). The worse thing that I have doubts that I can use generic_filter here because the docs say:

cval : scalar, optional Value to fill past edges of input if mode is ‘constant’.

So how can I solve my problem?

Upvotes: 0

Views: 418

Answers (2)

B. M.
B. M.

Reputation: 18628

Here a fully vectored solution.

First make flattened neighborhoods:

(n,m,_)=M.shape 
(sn,sm,s2)=M.strides 
newshape=(n-2,m-2,9,2)
newstrides=(sn,sm,2*s2,s2)
neighborhoods=np.lib.stride_tricks.as_strided(M,newshape,newstrides)
"""
array([[[[1, 1],
         [2, 2],
         [4, 1],
         [1, 1],
         [5, 5],
         [6, 6],
         [7, 7],
         [2, 3],
         [2, 2]],

        [[2, 2],
         [4, 1],
         [1, 1],
         [5, 5],
         [6, 6],
         [7, 7],
         [2, 3],
         [2, 2],
         [4, 1]]]])
     """

Then you have to pack the two components to use np.unique which work with 1D arrays. assuming M.dtype is int32, you can do that by view:

packed_neighborhoods=np.ascontiguousarray(neighborhoods).view(int64)
In [5]: packed_neighborhoods.shape
Out[5]: (1, 2, 9, 1)

Now we define a function that take a 1D array and find the indice of the most frequent , based on np.unique:

def mostfreq(arr):
     _,index,counts=unique(arr, return_index=True, return_counts=True)
     return index[counts.argmax()]

Apply it on the good axis:

ind2=apply_along_axis(mostfreq,2,packed_neighborhoods).squeeze()

And there is the result, including other indices.

ind0,ind1=indices(neighborhoods.shape[:2])
print(neighborhoods[ind0,ind1,ind2])  
"""
[[[1 1]
  [4 1]]]
"""

But your solution has same performance at the moment ;(.

Upvotes: 1

Salvador Dali
Salvador Dali

Reputation: 222471

One way I have found how to achieve this is by doing something like

def most_frequent(M):
    x, y, _ = arr.shape
    res = np.zeros((x - 2, y - 2, 2))
    for i in xrange(1, x - 1):
        for j in xrange(1, y - 1):
            neighbors = [M[i - 1, j - 1], M[i - 1, j], M[i - 1, j + 1], M[i, j - 1], M[i, j], M[i, j + 1], M[i + 1, j - 1], M[i + 1, j], M[i + 1, j + 1]]
            res[i - 1, j - 1] = Counter([tuple(_) for _ in neighbors]).most_common(1)[0][0]

    return res

Still looking for a better solution (the one that does not involve my 2 loops).

Upvotes: 0

Related Questions