Reputation: 553
I have this kind of Directed Acyclic Graph with multiple roots:
And I need to get a list with nodes sorted by directions and grouped by steps, like this:
ordering = [
[1, 3],
[2],
[4],
[5, 6],
[7]
]
Maybe there is some ready algorithm for this? I tried networkx.algorithm
but all of them can return me only a flat list without grouping by steps.
Upvotes: 6
Views: 3110
Reputation: 276
Try out NetworkX's topological generations
DG = nx.DiGraph([(1,2), (2,4), (3,4), (4,5), (4,6), (6,7)])
[sorted(generation) for generation in nx.topological_generations(DG)]
[[1, 3], [2], [4], [5, 6], [7]]
Upvotes: 8
Reputation: 2079
topological sort using DFS will solve the problem
from collections import defaultdict, deque
class Graph:
def __init__(self, directed=False, nodes=None, edges=None):
self.graph = defaultdict(list)
self.directed = directed
self.add_nodes(nodes)
self.add_edges(edges)
@property
def nodes(self):
if not self.directed:
return list(self.graph.keys())
elif self.directed:
nodes = set()
nodes.update(self.graph.keys())
for node in self.graph.keys():
for neighbor in self.graph[node]:
nodes.add(neighbor)
return list(nodes)
def add_node(self, node):
if node not in self.nodes:
self.graph[node] = list()
def add_nodes(self, nodes):
if nodes is None:
return None
for node in nodes:
self.add_node(node)
def remove_node(self, node):
try:
del self.graph[node]
except KeyError:
print(f'{node} is not in graph')
return None
# remove parallel edges, but keep other parallel edges untouched
for source, adj_list in self.graph.items():
empty = True
while empty:
if node in adj_list:
adj_list.remove(node)
else:
empty = False
def remove_nodes(self, nodes):
for node in nodes:
self.remove_node(node)
@property
def edges(self):
edges = list()
for source, neighbors in self.graph.items():
for neighbor in neighbors:
edges.append((source, neighbor))
return edges
def add_edge(self, edge):
node1, node2 = edge
self.graph[node1].append(node2)
if not self.directed:
self.graph[node2].append(node1)
def add_edges(self, edges):
if edges is None:
return None
for edge in edges:
self.add_edge(edge)
def remove_edge(self, edge):
self.remove_nodes(edge)
def remove_edges(self, edges):
for edge in edges:
self.remove_nodes(edge)
def topological_util(self, node, visited, label):
visited[node] = True
for edge in self.graph[node]:
if not visited[edge]:
self.topological_util(edge, visited, label)
label.appendleft(node)
def topological_sort(self):
visited = dict.fromkeys(self.nodes, False)
# store all nodes in topological order, the index is the position
label = deque()
for node in self.nodes:
if not visited[node]:
self.topological_util(node, visited, label)
return label
The class Graph and topological sort implemented in python. Hope this will help you.
Upvotes: 2
Reputation: 18221
nx.topological_sort
almost does what you want; the only difference is that it doesn't group items that enter the queue at the same time, but it's straightforward to adapt the source to make it do so:
def topological_sort_grouped(G):
indegree_map = {v: d for v, d in G.in_degree() if d > 0}
zero_indegree = [v for v, d in G.in_degree() if d == 0]
while zero_indegree:
yield zero_indegree
new_zero_indegree = []
for v in zero_indegree:
for _, child in G.edges(v):
indegree_map[child] -= 1
if not indegree_map[child]:
new_zero_indegree.append(child)
zero_indegree = new_zero_indegree
With your example:
In [21]: list(nx.topological_sort(G))
Out[21]: [3, 1, 2, 4, 6, 7, 5]
In [22]: list(topological_sort_grouped(G))
Out[22]: [[1, 3], [2], [4], [5, 6], [7]]
In practice, I have to wonder if there's a situation where this construction is more useful than just using nx.topological_sort
(or nx.lexicographical_topological_sort
) directly?
Upvotes: 8
Reputation: 7409
Your problem is solved by what is known as a "topological sort." Such a sort determines dependencies in a directed acyclic graph. I recently adapted a solution to this problem. Here is a simple python application that demonstrates its behavior:
# adapted from https://gist.github.com/kachayev/5910538
from collections import deque
GRAY, BLACK = 0, 1
def topological(graph):
order, enter, state = deque(), set(graph), {}
dot = "digraph X {\r\n"
for item in graph.keys():
dep = graph[item]
for d in dep:
dot += item + " -> " + str(d) + '\r\n'
dot += "}"
print(dot)
def dfs(node):
state[node] = GRAY
for k in graph.get(node, ()):
sk = state.get(k, None)
if sk == GRAY:
raise ValueError("cycle")
if sk == BLACK:
continue
enter.discard(k)
dfs(k)
#order.appendleft(node) # show highest to lowest
order.append(node) # show lowest to highest
state[node] = BLACK
while enter:
dfs(enter.pop())
return order
def main():
graph = {
'1': ['2'],
'2': ['4'],
'3': ['4'],
'4': ['5', '6'],
'6': ['7'],
}
try:
print(topological(graph))
except ValueError:
print("Cycle!")
if __name__ == "__main__":
main()
The output is
deque(['5', '7', '6', '4', '2', '1', '3'])
Please note also that my code creates a DOT 'digraph' string for visualization in GraphVis. You can safely leave that out once you gain confidence in the algorithm. You can reverse the commented and uncommented append
lines near the end to get the major nodes first. Also note, this solution determines a solution that satisfies the graph. There may be others, and it doesn't determine the order as you required, but the graph satisfaction is one correct answer.
Upvotes: 1
Reputation: 553
I wrote this code which solves my problem, but maybe there is some more elegant solution?
def process_cursor(G, passed, node_id):
if set(G.predecessors(node_id)).issubset(passed):
return True, list(G.successors(node_id))
return False, None
def get_all_roots(G: nx.DiGraph):
for node_id in G.nodes:
if not any(G.predecessors(node_id)):
yield node_id
def order_components(G: nx.DiGraph):
nodes_amount = len(G.nodes)
cursors = set(get_all_roots(G))
passed = []
iterations = 0
while len(passed) != nodes_amount:
iterations += 1
if iterations > nodes_amount:
raise ValueError("Could not create sequence of graph.")
step = []
next_cursors = []
step_passed = []
for node_id in cursors:
can_process, tmp_cursors = process_cursor(G, passed, node_id)
if can_process:
next_cursors.extend(tmp_cursors)
step_passed.append(node_id)
node_data = G.nodes[node_id]
step.append(node_id)
cursors = set(next_cursors)
passed.extend(step_passed)
if step:
yield step
yield append
Upvotes: 0