Jeff Meatball Yang
Jeff Meatball Yang

Reputation: 39057

Most efficient way of creating tree from adjacency list

I have an adjacency list of objects (rows loaded from SQL database with the key and it's parent key) that I need to use to build an unordered tree. It's guaranteed to not have cycles.

This is taking wayyy too long (processed only ~3K out of 870K nodes in about 5 minutes). Running on my workstation Core 2 Duo with plenty of RAM.

Any ideas on how to make this faster?

public class StampHierarchy {
    private StampNode _root;
    private SortedList<int, StampNode> _keyNodeIndex;

    // takes a list of nodes and builds a tree
    // starting at _root
    private void BuildHierarchy(List<StampNode> nodes)
    {
        Stack<StampNode> processor = new Stack<StampNode>();
        _keyNodeIndex = new SortedList<int, StampNode>(nodes.Count);

        // find the root
        _root = nodes.Find(n => n.Parent == 0);

        // find children...
        processor.Push(_root);
        while (processor.Count != 0)
        {
            StampNode current = processor.Pop();

            // keep a direct link to the node via the key
            _keyNodeIndex.Add(current.Key, current);  

            // add children
            current.Children.AddRange(nodes.Where(n => n.Parent == current.Key));

            // queue the children
            foreach (StampNode child in current.Children)
            {
                processor.Push(child);
                nodes.Remove(child); // thought this might help the Where above
            }
        }
    }
}

    public class StampNode {
         // properties: int Key, int Parent, string Name, List<StampNode> Children
    }

Upvotes: 7

Views: 8552

Answers (2)

Ian Mercer
Ian Mercer

Reputation: 39297

  1. Put the nodes into a sorted list or dictionary.

  2. Scan that list, pick up each node, find its parent node in the same list (binary search or dictionary lookup), add it to the Children collection of the parent node.

There's no need for a Stack to put this into a tree.

Upvotes: 5

Gary Linscott
Gary Linscott

Reputation: 1471

SortedList is not a good container to use in this context. It is O(n) for insertion operations (the repeated calls to Add()), as it is internally represented as a flat list. Using Dictionary instead of SortedList will be a large improvement, as it is O(1) amortized insertion time.

Upvotes: 2

Related Questions