how does kruskal's algorithm's performance gets affected by the disjoint set data structure?

I've got a basic idea on what Kruskal's algorithm is and this is what I discovered:

This algorithm basically constructs a minimal spanning tree by merging multiple trees and it begins by sorting the edges by their weights. Beginning with an empty sub graph, the algorithm scans the list of edges adding the next edge to the sub graph if it does not create a cycle.

Where as disjoint set is a data structure, which actually has few ways to derive the minimal spanning tree by using a linked-list or a forest-tree method.

What I want to know is, how disjoint set affects the performance of the Kruskal's algorithm? Any help could be appreciated.

Upvotes: 1

Answers (2)

arunk2

Reputation: 2426

Data structure 'disjoint set' is important to ensure O(E*logE) overall complexity.

Let us consider the standard kruskal algo.

KRUSKAL(G):

A = ∅
foreach v ∈ G.V:
    MAKE-SET(v)
foreach (u, v) in G.E ordered by weight(u, v), increasing:
   if FIND-SET(u) ≠ FIND-SET(v):
       A = A ∪ {(u, v)}
       UNION(u, v)
return A

The FIND-SET() and UNION() methods identify the sets(or forests) and join them. Both these operations can be done in O(1) with 'disjoint set' data structure. The overall complexity for the FIND and UNION part is O(E).

put together we have O(V) + O(E * logE) + O(E) = O(E * logE)

Hence, data structure 'disjoint set' is important to ensure O(E * logE) overall complexity.

Upvotes: 1

DAle

Reputation: 9127

Kruskal's algorithm sorts edges at first. This can be done in O(E*log(E)) = O(E*log(V^2)) = O(E*2*log(V)) = O(E*log(V)) time.

Then iterates through the edges and executes O(E) disjoint-set data structure operations (2 find & possibly 1 union on every iteration). The complexity of the operations depends on the disjoint-set implementation. Naive implementation has the O(V) union operation and O(1) find operation. This leads to O(E + V^2) time because union operation would be executed at most V times, but even with disjoint-sets forest with union by rank the complexity of both operations is O(log(V)) (it can be O(α(V)) with the addition of path compression).

Thus, Kruskal's algorithm with a naive disjoint-set data structure implementation:
O(E*log(V)) + O(E + V^2) = O(E*log(V) + V^2) (in sparse enough graphs second term would dominate)

implementation that has at least union by rank:
O(E*log(V)) + O(E*log(V)) = O(E*log(V))

Upvotes: 1

how does kruskal&#39;s algorithm&#39;s performance gets affected by the disjoint set data structure?

Answers (2)

Related Questions

how does kruskal's algorithm's performance gets affected by the disjoint set data structure?