user2505650
user2505650

Reputation: 1381

How to count number of zeros at the left of each one in an Numpy array

I have a numpy binary array like this:

   Array A = [1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0]

I would like to count how many 0s are there at the left of each 1, and return it in an other array that would look like this for this for this example:

nb_1s = [0, 0, 1, 2, 2, 5]

There are no 0s at the left for the two first 1s so the the first two numbers of the array are 0 etc...

I know that first I have to initiate an array with number of 1s in my array:

def give_zeros(binary_array):
    binary_array = np.asarray(binary_array)
    nb_zeros = np.zeros(binary_array.sum())


    return nb_zeros

But I'm not sure on how to count the number of zeros. Should I iterate in a for loop with 'nditer'? It doesn't seem efficient as i will have to run this function on very large arrays.

Do you have any ideas? Thank you.

Upvotes: 4

Views: 1665

Answers (4)

Divakar
Divakar

Reputation: 221564

Here's a vectorized way with differentiation of range array from the indices of 1s -

def leftzeros_count(a):
    idx = np.flatnonzero(a!=0)
    return idx - np.arange(len(idx))

Sample runs -

In [298]: a = np.array([1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0])

In [299]: leftzeros_count(a)
Out[299]: array([0, 0, 1, 2, 2, 5])

In [300]: a = np.array([0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0])

In [301]: leftzeros_count(a)
Out[301]: array([1, 1, 2, 3, 3, 6])

In [302]: a = np.array([0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1])

In [303]: leftzeros_count(a)
Out[303]: array([ 1,  1,  2,  3,  3,  6, 10])

Runtime test

For the timings, let's tile the given sample a large number of times and time the vectorized approaches -

In [7]: a = np.array([1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0])

In [8]: a = np.tile(a,100000)

# @Eric Duminil's soln
In [9]: %timeit (a == 0).cumsum()[a > 0]
100 loops, best of 3: 10.9 ms per loop

# Proposed in this post
In [10]: %timeit leftzeros_count(a)
100 loops, best of 3: 3.71 ms per loop

Upvotes: 3

Sebastian Fleck
Sebastian Fleck

Reputation: 101

If the count is cumulative (as per your example) then you can do this easily in O(n). Simply have a counter that increases by one every time you find a zero and then append the value of the counter variable to another array for every one you hit in your initial array.

Upvotes: 1

Eric Duminil
Eric Duminil

Reputation: 54223

Code

You could use:

(A == 0).cumsum()[A > 0]
# array([0, 0, 1, 2, 2, 5])

or:

(~A).cumsum()[A]
# array([0, 0, 1, 2, 2, 5])

if A is a bool array.

Explanation

A == 0 is a boolean array which is True for each 0:

>>> import numpy as np
>>> A = np.array([1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0])
>>> A == 0
array([False, False,  True, False,  True, False, False,  True,  True,
        True, False,  True,  True,  True,  True], dtype=bool)

You can use cumsum() to count the number of Trues:

>>> (A == 0).cumsum()
array([0, 0, 1, 1, 2, 2, 2, 3, 4, 5, 5, 6, 7, 8, 9])

You only need the values where A > 0:

>>> (A == 0).cumsum()[A > 0]
array([0, 0, 1, 2, 2, 5])

Done!

Upvotes: 4

alvas
alvas

Reputation: 122052

In the non-vectorized manner:

>>> x = [1, 1, 0, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 0, 0]
>>> c, y = 0, []
>>> for i in x:
...     if i == 1:
...         y.append(c)
...     else:
...         c += 1
... 
>>> y
[0, 0, 1, 2, 2, 5]

For vectorized solution, see @Divakar's answer:

In numpy, first find the non-zero indices, with np.nonzero():

>>> np.nonzero(x)[0]
array([ 0,  1,  3,  5,  6, 10])

Then subtract that with the range array of length of indices:

>>> idx = np.nonzero(x)[0]
>>> np.arange(len(idx))
array([0, 1, 2, 3, 4, 5])
>>> np.nonzero(x)[0] - np.arange(len(idx))
array([0, 0, 1, 2, 2, 5])

>>> np.arange(x.count(1))
array([0, 1, 2, 3, 4, 5])
>>> np.nonzero(x)[0] - np.arange(x.count(1))
array([0, 0, 1, 2, 2, 5])

Upvotes: 2

Related Questions