Vectorized Counting in 2D array python

Question

I am trying to build a function that counts up zeros in a list until nonzero entry appears, and start counting from 0 again. For example,

>>> a
array([[ 0,  0,  1,  0,  2],
       [ 0,  0,  0,  1,  1],
       [ 0,  1,  0,  0,  0],
       [ 0,  0, 10,  2,  2],
       [ 2,  0,  0,  0,  0]])

In this case, my desired output will be

array([[1, 1, 0, 1, 0],
       [2, 2, 1, 0, 0],
       [3, 0, 2, 1, 1],
       [4, 1, 0, 0, 0],
       [0, 2, 1, 1, 1]])

I tried making this with two for-loops, but it is very slow on a very large data set. I was hoping I could find a way to vectorize this operation so that the time is O(n) instead of O(n^2). Any help will be appreciated!

Devin Cairns · Accepted Answer

What about something like this, which is vectorized and has no for loops:

def moving_count(a, value, axis=0):
    """Return sequential counts of a given value along an axis"""
    if np.all(a == value):
        # Fill the output with counts along the desired axis
        return np.rollaxis(np.arange(1, a.size + 1).reshape(a.shape), axis)

    # Allocate output with a cumulative count of value along flattened axis
    output = np.cumsum(np.rollaxis(a, axis) == value)

    # Find locations of breakpoints
    breakpoints = (np.roll(output, 1) - output) == 0

    # Since breakpoints is boolean, argmax returns the location of the first breakpoint
    threshold = np.argmax(breakpoints)

    # Repeat the cumulative value along labels and subtract to reset the count at each breakpoint
    output[threshold:] -= output[breakpoints][np.cumsum(breakpoints) - 1][threshold:]

    # Reshape and return axis to match input array
    return np.rollaxis(output.reshape(a.shape), axis)

Applied to your problem:

In[3]: a = np.array([[ 0,  0,  1,  0,  2],
                     [ 0,  0,  0,  1,  1],
                     [ 0,  1,  0,  0,  0],
                     [ 0,  0, 10,  2,  2],
                     [ 2,  0,  0,  0,  0]])
In[4]: moving_count(a, 0, 1)
Out[4]: 

array([[1, 1, 0, 2, 0],
       [2, 2, 1, 0, 0],
       [3, 0, 2, 1, 1],
       [4, 1, 0, 0, 0],
       [0, 2, 1, 1, 1]])

Vectorized Counting in 2D array python

Answers (2)

Related Questions