Anatoly
Anatoly

Reputation: 99

How to segment a matrix by neighbouring values?

Suppose I have a matrix like this:

     m = [0, 1, 1, 0,
          1, 1, 0, 0,
          0, 0, 0, 1]

And I need to get the coordinates of the same neighbouring values (but not diagonally):


like this

So the result would be a list of lists of coordinates in the "matrix" list, starting with [0,0], like this:

r = [[[0,0]],   
     [[0,1], [0,2], [1,0], [1,1]],   
     [[0,3], [1,2], [1,3], [2,0], [2,1], [2,2]]   
     [[2,3]]]    

There must be a way to do that, but I'm really stuck.

Upvotes: 2

Views: 832

Answers (3)

warped
warped

Reputation: 9481

tl;dr: We take an array of zeros and ones and use scipy.ndimage.label to convert it to an array of zeros and [1,2,3,...]. We then use np.where to find the coordinates of each element with value > 0. Elements that have the same value end up in the same list.

enter image description here

scipy.ndimage.label interprets non-zero elements of a matrix as features and labels them. Each unique feature in the input gets assigned a unique label. Features are e.g. groups of adjacent elements (or pixels) with the same value.

import numpy as np
from scipy.ndimage import label

# make dummy data
arr = np.array([[0,1,1,0], [1,1,0,0], [0,0,0,1]])

#initialise list of features
r = []

Since OP wanted all features, that is groups of zero and non-zero pixels, we use label twice: First on the original array, and second on 1 - original array. (For an array of zeros and ones, 1 - array just flips the values).

Now, label returns a tuple, containing the labelled array (which we are interested in) and the number of features that it found in that array (which we could use, but when I coded this, I chose to ignore it. So, we are interested in the first element of the tuple returned by label, which we access with [0]:

a = label(arr)[0]
b = label(1-arr)[0]

Now we check which unique pixel values label has assigned. So we want the set of a and b, repectively. In order for set() to work, we need to linearise both arrays, which we do with .ravel(). We have to subtract {0} in both cases, because for both a and b we are interested in only the non-zero values.

So, having found the unique labels, we loop through these values, and use np.where to find where on the array a given value is located. np.where returns a tuple of arrays. The first element of this tuple are all the row-coordinates for which the condition was met, and the second element are the column-coordinates. So, we can use zip(* to unpack the two containers of length n to n containers of length 2. This means that we go from list of all row-coords + list of all column-coords to list of all row-column-coordinate pairs for which the condition is met. Finally in python 3, zip is a generator, which we can evaluate by calling list() on it. The resulting list is then appended to our list of coordinates, r.

for x in set(a.ravel())-{0}:
    r.append(list(zip(*np.where(a==x))))
for x in set(b.ravel())-{0}:
    r.append(list(zip(*np.where(b==x))))

print(r)

[[(0, 1), (0, 2), (1, 0), (1, 1)],
 [(2, 3)],
 [(0, 0)],
 [(0, 3), (1, 2), (1, 3), (2, 0), (2, 1), (2, 2)]]

That said, we can speed up this code slightly by making use of the fact that label returns the number of features it assigned. This allows us to avoid the set command, which can take time on large arrays:

a, num_a = label(arr)

for x in range(1, num_a+1): # range from 1 to the highest label
    r.append(list(zip(*np.where(a==x))))

Upvotes: 5

Josh Sharkey
Josh Sharkey

Reputation: 1038

Returns as a list of groups under value key.

import numpy as np
import math


def get_keys(old_dict):
    new_dict = {}
    for key, value in old_dict.items():
        if value not in new_dict.keys():
            new_dict[value] = []
            new_dict[value].append(key)
        else:
            new_dict[value].append(key)
    return new_dict

def is_neighbor(a,b):
    if a==b:
        return True
    else:
        distance = abs(a[0]-b[0]), abs(a[1]-b[1])
        return distance == (0,1) or distance == (1,0)

def collate(arr):
    arr2 = arr.copy()
    ret = []
    for a in arr:
        for i, b in enumerate(arr2):
            if set(a).intersection(set(b)):
                a = list(set(a+b))
        ret.append(a)
    for clist in ret:
        clist.sort()

    return [list(y) for y in set([tuple(x) for x in ret])]


def get_groups(d):
    for k,v in d.items():
        ret = []
        for point in v:
            matches = [a for a in v if is_neighbor(point, a)]
            ret.append(matches)
        d[k] = collate(ret)

    return d


a = np.array([[0,1,1,0],
              [1,1,0,0],
              [0,0,1,1]])

d = dict(np.ndenumerate(a))
d = get_keys(d)
d = get_groups(d)

print(d)

Result:

{
  0: [[(0, 3), (1, 2), (1, 3)], [(0, 0)], [(2, 0), (2, 1)]], 
  1: [[(2, 2), (2, 3)], [(0, 1), (0, 2), (1, 0), (1, 1)]]
}

Upvotes: 1

Andrej Kesely
Andrej Kesely

Reputation: 195478

A solution with only standard libraries:

from pprint import pprint

m = [0, 1, 1, 0,
     1, 1, 0, 0,
     0, 0, 0, 1]

def is_neighbour(x1, y1, x2, y2):
    return (x1 in (x2-1, x2+1) and y1 == y2) or \
           (x1 == x2 and y1 in (y2+1, y2-1))

def is_value_touching_group(val, groups, x, y):
    for d in groups:
        if d['color'] == val and any(is_neighbour(x, y, *cell) for cell in d['cells']):
            return d

def check(m, w, h):
    groups = []

    for i in range(h):
        for j in range(w):
            val = m[i*w + j]
            touching_group = is_value_touching_group(val, groups, i, j)
            if touching_group:
                touching_group['cells'].append( (i, j) )
            else:
                groups.append({'color':val, 'cells':[(i, j)]})

    final_groups = []
    while groups:
        current_group = groups.pop()
        for c in current_group['cells']:
            touching_group = is_value_touching_group(current_group['color'], groups, *c)
            if touching_group:
                touching_group['cells'].extend(current_group['cells'])
                break
        else:
            final_groups.append(current_group['cells'])

    return final_groups

pprint( check(m, 4, 3) )

Prints:

[[(2, 3)],
 [(0, 3), (1, 3), (1, 2), (2, 2), (2, 0), (2, 1)],
 [(0, 1), (0, 2), (1, 1), (1, 0)],
 [(0, 0)]]

Upvotes: 4

Related Questions