Reputation: 169
I'm coding in python.
I have a list of sets. These sets contain integers. If two sets share an integer item, it is "connected". My goal is to determine if all of these sets are all mutually connected into a single group (as opposed to no connected sets or multiple groups of mutually connected sets).
Is there a common algorithm for this? It seems like a widely applicable goal.
This is my proposed solution:
start with first set and check if contents are shared with any other set
delete any set with shared content and add other contents to first set
repeat until no change to first set
if all other sets have been deleted, then they are all connected
clarification
I want to distinguish one mutually connected chain of sets
o--o--o--o--o--o
from separate groups of mutually connected sets
o--o--o o--o--o
So, simply checking if each set is connected to another set is not enough.
Upvotes: 1
Views: 51
Reputation: 1424
If the possible integers are known and their number has a small upper bound like 32 you can represent each set as vector of bits and apply bitwise and like this: x(n) = x(n-1) & s(n)
with s(n)
being the n-th set and &
being bitwise and. If all bits of x(n)
are zero for any n
you will know that there are multiple groups of sets. The time complexity of this approach is linear and uses operations that current hardware can execute very efficiently.
This and any other solution can be extended by the following check. It has to be applied before the original solution. The idea is to be finished quickly in some cases. The check requires that the minimal and and maximal integer of each set are known. If the greatest of all these minimum integers is greater than the smallest of all maximal integers you will know that there are multiple groups of sets. So, in this case you can finish. If the condition is not true you will have to continue with the original solution.
Upvotes: 0
Reputation: 178481
Your solution is correct, and is a variant of DFS (though since you manipulate the sets it might be a bit inefficient)
Your problem is basically a graph problem, where the graph is:
G = (V,E)
V = { sets } = {S1, S2, ..., Sn}
E = { (Si,Sj) | Si and Sj share an integer }
This graph is undirected by nature, and your problem is finding if it is connected or not. This can be done by BFS or DFS. Just start from one arbitrary vertex, until you are "stuck" (without restarting from a new source). If when it happens, you have "discovered" all sets, the graph is connected. Otherwise, it is not.
Run time is O(|V|+|E|)
, where |V|
is the number of sets you have, and |E|
is the number of connections.
Note: The set E
can be calculate efficiently for sparsed graphs by creating an inverted index. For each number, create a list of all the sets that contain this number (this is liner in the size of the input), and then generate the edges by going through all pairs in the list (for sparsed graphs this should be fairly small).
Though for dense graph, a more efficient way to generate it will probably just be to go through all pairs of sets.
Upvotes: 3
Reputation: 2584
Here's what I would try:
compare each of the set members:
if there is a common Integer then leave one set and continue with another, continue with this until there are no more sets.
If there isn't any members shared than break and output false.
Upvotes: 0