Someone
Someone

Reputation: 643

Do the order of edges matter in union find?

I am learning and to understand it better, I have written a small program:

#include <iostream>
#include <vector>
#include <numeric>
using namespace std;
 
vector<int> parent, sz;
 
int find(int i) {
    if(parent[i]==i) return i;
    return parent[i]=find(parent[i]);
}
 
void merge(int i, int j) {
    int p1=find(i);
    int p2=find(j);
 
    if(p1==p2) return;
    if(sz[p1]<sz[p2]) {
        parent[p1]=p2;
        sz[p2]+=sz[p1];
    } else {
        parent[p2]=p1;
        sz[p1]+=sz[p2];
    }
}
 
int main() {
    vector<vector<int>> allowedSwaps={{0,4},{4,2},{1,3},{1,4}};
 
    int n=5;    //hard-code for now
    sz.resize(n,1);
    parent.resize(n);
    iota(begin(parent),end(parent),0);
 
    cout<<"Parents before: \n";
    for(auto& e: parent)
        cout<<e<<" ";
    cout<<"\n";
 
    for(vector<int>& currswap: allowedSwaps) {
        merge(currswap[0],currswap[1]);
    }
 
    cout<<"Parents after: \n";
    for(auto& e: parent)
        cout<<e<<" ";
    cout<<"\n";
 
    return 0;
}

allowedSwaps essentially denotes all the components that are connected to each other. For the code above, all of them are connected.

However, if you notice the output:

Parents before: 
0 1 2 3 4 
Parents after: 
0 0 0 1 0 

It shows that one of them (3, 0-indexed) is not connected. I think that is because when we process the edge {1,3}, both the vertices 1 and 3 have not been connected to the larger component yet (they are later, by {1,4}) and so Union Find determines that it is a different component.

Does this mean that the order of edges matter for Union find? Or, am I missing something?

Upvotes: 3

Views: 274

Answers (3)

biqarboy
biqarboy

Reputation: 852

You are dealing issue with path compression. After merging all the connected components, just call your find() function for every element. Please consider adding the following lines at the end of your program:

for(int i=0; i<5; i+=1) find(i);

cout<<"Parents after path compression: \n";
for(auto& e: parent)
    cout<<e<<" ";
cout<<"\n";

And you will get the expected result like this:

Parents before: 
0 1 2 3 4 
Parents after: 
0 0 0 1 0 
Parents after path compression: 
0 0 0 0 0 

Upvotes: 0

templatetypedef
templatetypedef

Reputation: 372892

I was with you in your question up until this point:

Parents after:
0 0 0 1 0

It shows that one of them (3, 0-indexed) is not connected.

Actually, item 3 is indeed linked to all the other items. Think about what would happen if you called find(3) here. Since 3's parent is 1, and 1's parent is 0, we'd the call to find(3) would return 0, the same cluster that contains everything else.

I think what might be tripping you up here is that two elements can be in the same cluster even if their immediate parents aren't the same. That's normal, and it's why the find function traces through the parents until it finds an item that's its own parent.

Upvotes: 1

trincot
trincot

Reputation: 350375

The output you got does not signify that one node is disconnected.

The parent data structure represents links from one node to another (or itself, when it is a root).

At the start you have this:

enter image description here

And at the end we have this:

enter image description here

The important thing here is that there is one tree. It is not required that all nodes are linked directly to the root, just that they have a path to the root, which is also true for the node with index 3.

NB: if you would call find(3), then also index 3 would receive value 0. It is just something that gets optimised by calling find, but it doesn't change the meaning of result.

Upvotes: 7

Related Questions