Identifying largest connected component in a matrix

Question

I have a python numpy matrix with 1´s and 0´s, I need to identify the largest "collection" of 1's in the matrix: https://i.sstatic.net/DbxOq.jpg

The matrix can have up to 960.000 elements so I would like to avoid a brute-force solution.

What is the smartest way to go about solving this problem?

Benjy Kessler · Accepted Answer

You can use a data structure called disjoint-set (here is a python implementation). This data structure was designed for this kind of task.

You iterate over the rows if the current element is 1, check if any of the already traversed neighbors are 1. If so add this element to its set. If there are more than 1 union those sets. If no neighbors are 1 create a new set. At the end output the largest set.

This would work as follows:

def MakeSet(x):
  x.parent = x
  x.rank   = 0
  x.size = 1

def Union(x, y):
  xRoot = Find(x)
  yRoot = Find(y)
  if xRoot.rank > yRoot.rank:
    yRoot.parent = xRoot
  elif xRoot.rank < yRoot.rank:
    xRoot.parent = yRoot
  elif xRoot != yRoot: # Unless x and y are already in same set, merge them
    yRoot.parent = xRoot
    xRoot.rank = xRoot.rank + 1
  x.size += y.size
  y.size = x.size

def Find(x):
  if x.parent == x:
    return x
  else:
    x.parent = Find(x.parent)
    return x.parent

""""""""""""""""""""""""""""""""""""""""""

class Node:
  def __init__ (self, label):
    self.label = label
  def __str__(self):
    return self.label

rows = [[1, 0, 0], [1, 1, 0], [1, 0, 0]]
setDict = {}
for i, row in enumerate(rows):
  for j, val in enumerate(row):
    if row[j] == 0:
      continue
    node = Node((i, j))
    MakeSet(node)
    if i > 0:
      if rows[i-1][j] == 1:
        disjointSet = setDict[(i-1, j)]
        Union(disjointSet, node)
    if j > 0:
      if row[j-1] == 1:
      disjointSet = setDict[(i, j-1)]
      Union(disjointSet, node)
    setDict[(i, j)] = node
print max([l.size for l in setDict.values()])

>> 4

This is a full working example with code for disjoint set taken from the link above.

Identifying largest connected component in a matrix

Answers (2)

Related Questions