Reputation: 2383
I have a data set that is a special case of a directed acyclic graph (DAG). The nodes in my DAG have either 0 or 1 arcs. All arcs are equally weighted (that is, the only information contained in an arc is the node it points at, no "distance" or "cost" or "weight").
My users will input nodes in a semi-random order, with the expectation that the order of arcless nodes be preserved, but all arcs get sorted before the nodes that point at them (so child-first ordering).
Here is a simplified class representing my data:
class Node:
def __init__(self, name, arc=None):
self.name = str(name)
self.arc = str(arc)
(note that self.arc
is a string representation of the pointed-at node and not the object itself)
So, given an input like this:
input = [Node('A'),
Node('B'),
Node('C'),
Node('Z', 'Y'),
Node('X', 'W'),
Node('Y', 'X'),
Node('W')]
You would get output like this, preferably using the fewest loops and intermediate data structures:
output = [Node('A'),
Node('B'),
Node('C'),
Node('W'),
Node('X', 'W'),
Node('Y', 'X'),
Node('Z', 'Y')]
Upvotes: 0
Views: 662
Reputation: 2383
So far this is the best algorithm I've been able to come up with:
def topo_sort(nodes):
objects = {} # maps node names back to node objects
graph = OrderedDict() # maps node names to arc names
output = []
for node in nodes:
objects[node.name] = node
graph[node.name] = node.arc
arcs = graph.values()
# optional
missing = set(arcs) - set(graph) - {None}
if missing:
print('Invalid arcs: {}'.format(', '.join(missing)))
def walk(arc):
"""Recurse down the tree from root to leaf nodes."""
obj = objects.get(arc)
if obj and obj not in output:
follow_the_arcs(graph.get(arc))
output.append(obj)
# Find all root nodes
for node in graph:
if node not in arcs:
walk(node)
return ordered
What I like about this is that there are only 2 loops (one to build the graph and one to find the roots), and the output is built exclusively with list.append()
, no slow list.insert()
s. Can anybody think of any improvements to this? Any obvious inefficiencies I've overlooked?
Thanks!
Upvotes: 0