Reputation: 1343
I have a python numpy matrix with 1´s and 0´s, I need to identify the largest "collection" of 1's in the matrix: https://i.sstatic.net/DbxOq.jpg
The matrix can have up to 960.000 elements so I would like to avoid a brute-force solution.
What is the smartest way to go about solving this problem?
Upvotes: 1
Views: 2062
Reputation: 508
I think the count will be off in the provided answer. E.g. if rows is changed to rows = [[1, 0, 0], [1, 1, 1], [1, 0, 0]]
still getting 4, though it should be 5. Changing Union to
def Union(x, y):
xRoot = Find(x)
yRoot = Find(y)
if xRoot.rank > yRoot.rank:
yRoot.parent = xRoot
xRoot.size += yRoot.size
elif xRoot.rank < yRoot.rank:
xRoot.parent = yRoot
yRoot.size += xRoot.size
elif xRoot != yRoot: # Unless x and y are already in same set, merge them
yRoot.parent = xRoot
xRoot.rank = xRoot.rank + 1
xRoot.size += yRoot.size
seems to fix.
Upvotes: 2
Reputation: 7656
You can use a data structure called disjoint-set (here is a python implementation). This data structure was designed for this kind of task.
You iterate over the rows if the current element is 1, check if any of the already traversed neighbors are 1. If so add this element to its set. If there are more than 1 union those sets. If no neighbors are 1 create a new set. At the end output the largest set.
This would work as follows:
def MakeSet(x):
x.parent = x
x.rank = 0
x.size = 1
def Union(x, y):
xRoot = Find(x)
yRoot = Find(y)
if xRoot.rank > yRoot.rank:
yRoot.parent = xRoot
elif xRoot.rank < yRoot.rank:
xRoot.parent = yRoot
elif xRoot != yRoot: # Unless x and y are already in same set, merge them
yRoot.parent = xRoot
xRoot.rank = xRoot.rank + 1
x.size += y.size
y.size = x.size
def Find(x):
if x.parent == x:
return x
else:
x.parent = Find(x.parent)
return x.parent
""""""""""""""""""""""""""""""""""""""""""
class Node:
def __init__ (self, label):
self.label = label
def __str__(self):
return self.label
rows = [[1, 0, 0], [1, 1, 0], [1, 0, 0]]
setDict = {}
for i, row in enumerate(rows):
for j, val in enumerate(row):
if row[j] == 0:
continue
node = Node((i, j))
MakeSet(node)
if i > 0:
if rows[i-1][j] == 1:
disjointSet = setDict[(i-1, j)]
Union(disjointSet, node)
if j > 0:
if row[j-1] == 1:
disjointSet = setDict[(i, j-1)]
Union(disjointSet, node)
setDict[(i, j)] = node
print max([l.size for l in setDict.values()])
>> 4
This is a full working example with code for disjoint set taken from the link above.
Upvotes: 3