MattC1990

Reputation: 57

Scanning for groups of the same value in numpy array

I have a numpy array where 0 denotes empty space and 1 denotes a filled location. I am looking for a quick way to scan the array for places where several zeros are adjacent to each other, and to return the location of the central zero.

For example, if I had the following array:

[0 1 0 1]
[0 0 0 1]
[0 1 0 1]
[1 1 1 1]

I want to return the locations where a central zero has an adjacent zero on either side,

e.g.

[1,1]

as this is the centre of 3 zeros, i.e. there is a zero on either side of the zero at this location.

I'm aware that this can be done with if statements, but I wondered if there is a more Pythonic way of doing it.

Any help is greatly appreciated.

Upvotes: 1

Views: 616

Answers (1)

jakevdp

Reputation: 86513

The question does not fully specify the desired output for arbitrary inputs, but here is one possible approach for this kind of problem, adapted to the output shown. It uses np.cumsum, np.bincount, np.where, and np.median to find the middle index of each group of consecutive zeros along the rows of a 2D array:

import numpy as np

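def find_groups(x, min_size=3, value=0):
  # Compute a sequential label for groups in each row: the label increases
  # by one at every entry that differs from `value`.
  xc = (x != value).cumsum(1)

  # Count the number of occurrences per label in each row.
  counts = np.apply_along_axis(
      lambda row: np.bincount(row, minlength=1 + xc.max()),
      axis=1, arr=xc)

  # Every label after the first also covers the non-matching entry that
  # starts it, so drop that entry to count only entries equal to `value`.
  counts[:, 1:] -= (counts[:, 1:] > 0).astype(counts.dtype)

  # Filter by minimum number of occurrences.
  i, j = np.where(counts >= min_size)

  # Compute the median index of each group, rounded up.
  return [
    (int(ii), int(np.ceil(np.median(np.where(xc[ii] == jj)[0]))))
    for ii, jj in zip(i, j)
  ]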

x = np.array([[0, 1, 0, 1],
              [0, 0, 0, 1],
              [0, 1, 0, 1],
              [1, 1, 1, 1]])

print(find_groups(x))
# [(1, 1)]
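
To see what the labelling step does, it may help to look at a single row in isolation. The snippet below is only an illustration; the names row and labels are made up for this example and are not part of find_groups:

```python
import numpy as np

row = np.array([0, 1, 0, 0, 0, 0, 1])

# The label increases by one at every non-zero entry, so each run of zeros
# shares its label with the non-zero entry that precedes it.
labels = (row != 0).cumsum()
print(labels)
# [0 1 1 1 1 1 2]

# Number of entries carrying each label. Label 1 covers the 1 at index 1
# plus the four zeros that follow it, hence 5.
print(np.bincount(labels))
# [1 5 1]
```

Note that the count for any label after the first includes that leading non-zero entry, so it is one larger than the number of zeros in the run.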

It should work properly even for multiple rows with groups of varying sizes, and even multiple groups per row:

x2 = np.array([[0, 1, 0, 1, 1, 1, 1],
               [0, 0, 0, 1, 0, 0, 0],
               [0, 1, 0, 0, 0, 0, 1],
               [0, 0, 0, 0, 0, 0, 0]])

print(find_groups(x2))
# [(1, 1), (1, 5), (2, 3), (3, 3)]
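
If all that is needed is the literal case from the question (a zero with a zero immediately to its left and right), a plain sliced comparison might be enough. This is a different technique from the grouping approach above, offered as a rough sketch: the helper name centers_of_three is hypothetical, and it assumes only horizontal neighbours matter.

```python
import numpy as np

def centers_of_three(x, value=0):
    # Hypothetical helper: return (row, col) for every cell equal to `value`
    # whose left and right neighbours are also equal to `value`.
    hit = np.asarray(x) == value
    # Compare each interior column with the columns on either side of it.
    center = hit[:, 1:-1] & hit[:, :-2] & hit[:, 2:]
    rows, cols = np.nonzero(center)
    # Shift columns by one because the first column was sliced off.
    return [(int(r), int(c) + 1) for r, c in zip(rows, cols)]

x = np.array([[0, 1, 0, 1],
              [0, 0, 0, 1],
              [0, 1, 0, 1],
              [1, 1, 1, 1]])

print(centers_of_three(x))
# [(1, 1)]
```

Unlike find_groups, this reports every qualifying cell rather than one middle index per group, so a run of five zeros yields three locations.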

Upvotes: 1
