NicolaiF
NicolaiF

Reputation: 1343

Identifying largest connected component in a matrix

I have a python numpy matrix with 1´s and 0´s, I need to identify the largest "collection" of 1's in the matrix: https://i.sstatic.net/DbxOq.jpg

The matrix can have up to 960.000 elements so I would like to avoid a brute-force solution.

What is the smartest way to go about solving this problem?

Upvotes: 1

Views: 2062

Answers (2)

Ken
Ken

Reputation: 508

I think the count will be off in the provided answer. E.g. if rows is changed to rows = [[1, 0, 0], [1, 1, 1], [1, 0, 0]] still getting 4, though it should be 5. Changing Union to

def Union(x, y):
  xRoot = Find(x)
  yRoot = Find(y)
  if xRoot.rank > yRoot.rank:
    yRoot.parent = xRoot
    xRoot.size += yRoot.size
  elif xRoot.rank < yRoot.rank:
    xRoot.parent = yRoot
    yRoot.size += xRoot.size
  elif xRoot != yRoot:  # Unless x and y are already in same set, merge them
    yRoot.parent = xRoot
    xRoot.rank = xRoot.rank + 1
    xRoot.size += yRoot.size

seems to fix.

Upvotes: 2

Benjy Kessler
Benjy Kessler

Reputation: 7656

You can use a data structure called disjoint-set (here is a python implementation). This data structure was designed for this kind of task.

You iterate over the rows if the current element is 1, check if any of the already traversed neighbors are 1. If so add this element to its set. If there are more than 1 union those sets. If no neighbors are 1 create a new set. At the end output the largest set.

This would work as follows:

def MakeSet(x):
  x.parent = x
  x.rank   = 0
  x.size = 1

def Union(x, y):
  xRoot = Find(x)
  yRoot = Find(y)
  if xRoot.rank > yRoot.rank:
    yRoot.parent = xRoot
  elif xRoot.rank < yRoot.rank:
    xRoot.parent = yRoot
  elif xRoot != yRoot: # Unless x and y are already in same set, merge them
    yRoot.parent = xRoot
    xRoot.rank = xRoot.rank + 1
  x.size += y.size
  y.size = x.size

def Find(x):
  if x.parent == x:
    return x
  else:
    x.parent = Find(x.parent)
    return x.parent

""""""""""""""""""""""""""""""""""""""""""

class Node:
  def __init__ (self, label):
    self.label = label
  def __str__(self):
    return self.label

rows = [[1, 0, 0], [1, 1, 0], [1, 0, 0]]
setDict = {}
for i, row in enumerate(rows):
  for j, val in enumerate(row):
    if row[j] == 0:
      continue
    node = Node((i, j))
    MakeSet(node)
    if i > 0:
      if rows[i-1][j] == 1:
        disjointSet = setDict[(i-1, j)]
        Union(disjointSet, node)
    if j > 0:
      if row[j-1] == 1:
      disjointSet = setDict[(i, j-1)]
      Union(disjointSet, node)
    setDict[(i, j)] = node
print max([l.size for l in setDict.values()])

>> 4

This is a full working example with code for disjoint set taken from the link above.

Upvotes: 3

Related Questions