Algorithm to find MST in a huge complete graph

Let's assume a complete graph of > 25000 nodes. Each node is essentially a point on a plane. It has 625M edges. Each edge has length which should be stored as a floating point number.

I need an algorithm to find its MST (on a usual PC).

If I take Kruskal's algorithm, it needs to sort all edges first, but I cannot afford even store the edges altogether in memory at the same time.

If I choose Prim's algorithm, it's quite difficult to evaluate how many edges will be stored in a heap at the same time, but probably the most of them will be there very soon after algorithm starts.

Is there any more memory-sufficient algorithm which can allow me to avoid sorting edges stored in a file?

Also, are there any known MST algorithms which utilize the fact that any tree edges satisfy triangle inequality?

Upvotes: 9

Answers (3)

David Eisenstat

Reputation: 65458

Boruvka's algorithm makes a logarithmic number of passes on the unsorted edge list. The memory required is proportional to the number of nodes.

Upvotes: 3

Nuclearman

Reputation: 5304

You can still use Kruskal's algorithm.

You don't actually need to sort the edges, what the algorithm requires is simply a method for repeatably finding the smallest weight edge that hasn't already been used. Presorting the edges and iterating through that list is simply a very efficient way of doing so.

You can do the same thing simply by repeatably find the k-smallest unused edges (where k is a manageable number, probably at least |V|), then sort and iterate through them instead as needed. This breaks the sorting process down into more manageable segments, although there is a time-space tradeoff as depending on how large k is the time complexity of this process can be anywhere from O(E log E) (k = E) to about O(E^2) (k = 1).

Upvotes: 6

Jayram

Reputation: 19578

Try to use this algorithm

1: Append weight w and outgoing vertex v per edge into a list, X.
2: Divide the edge-list, E, into segments with 1 indicating the start
of each segment, and 0 otherwise, store this in flag array F.
3: Perform segmented min scan on X with F indicating segments
to find minimum outgoing edge-index per vertex, store in NWE.
4: Find the successor of each vertex and add to successor array, S.
5: Remove cycle making edges from NWE using S, and identify
representatives vertices.
6: Mark remaining edges from NWE as part of output in MST.
7: Propagate representative vertex ids using pointer doubling.
8: Append successor array’s entries with its index to form a list, L
9: Split L, create flag over split output and scan the flag to find
new ids per vertex, store new ids in C.
10: Find supervertex ids of u and v for each edge using C.
11: Remove edge from edge-list if u, v have same supervertex id.
12: Remove duplicate edges using split over new u, v and w.
13: Compact and create the new edge-list and weight list .
14: Build the vertex list from the newly formed edge-list.
15: Call the MST Algorithm on

Author:

Vibhav Vineet    
Pawan Harish    
Suryakant Patidar    
P. J. Narayanan

Source

Upvotes: 0

Algorithm to find MST in a huge complete graph

Answers (3)

Related Questions