Reputation: 179
I have a large DiGraph
that consists of dependent jobs. For example, for the graph a>b>c, job c can run only once job b is completed. I tried to find a function that gets all c's upstream jobs (that is (a, b)). I used DiGraph.predecessors
, but it only returns job b.
Is there a function that will list all c's upstream jobs?
How can I draw the dependency diagram for a leaf node (like job c)?
I have read the documentation but couldn't find the answer.
Upvotes: 13
Views: 8585
Reputation: 10167
I had a similar problem for a recent project, where jobs were dependent on other jobs. I found the most reliable way to guarantee processing them in the correct order was to use the DiGraph method topological_sort
.
Below is a minimal example, starting with the creation of the DiGraph:
import networkx as nx
import matplotlib.pyplot as plt
from networkx.drawing.nx_pydot import graphviz_layout
# this function is optional
# just moves the node labels on the plot so they don't overlap the lines
def nudge(pos, x_shift, y_shift):
return {n:(x + x_shift, y + y_shift) for n,(x,y) in pos.items()}
# start with a directed graph
dag = nx.DiGraph()
dag.add_edges_from(
[
('root', 'a'),
('a', 'b'),
('a', 'c'),
('b', 'd'),
('d', 'e'),
('d', 'f')
]
)
# plot the graph
pos = graphviz_layout(dag, prog="dot")
pos_labels = nudge(pos, -5, 15)
fig = plt.figure(figsize=(8, 6))
ax = plt.subplot(1, 1, 1)
plt.margins(0.1)
nx.draw(
dag,
pos=pos,
ax=ax,
with_labels=False,
arrows=True,
node_size=100,
arrowsize=20
)
labels = nx.draw_networkx_labels(
dag,
pos=pos_labels,
ax=ax,
font_size=18,
font_weight='bold',
font_color='orange',
verticalalignment='bottom'
)
The output plot of the graph looks like:
Probably the best thing to do first is verify the graph is a directed acyclic graph:
nx.is_directed_acyclic_graph(dag)
That will return True or False. Next, we can use topological_sort
to retrieve the nodes in dependency order:
list(nx.algorithms.topological_sort(dag))
This returns:
['root', 'a', 'b', 'c', 'd', 'e', 'f']
If you would instead like to have the edge pairs instead of just the nodes:
list(nx.topological_sort(nx.line_graph(dag)))
which returns:
[('root', 'a'), ('a', 'b'), ('a', 'c'), ('b', 'd'), ('d', 'e'), ('d', 'f')]
Finally, if the jobs need to be in reverse order you can use reverse
:
dag_reverse = dag.reverse()
Re-running the commands::
list(nx.algorithms.topological_sort(dag_reverse))
gives:
['c', 'e', 'f', 'd', 'b', 'a', 'root']
and:
list(nx.topological_sort(nx.line_graph(dag_reverse)))
gives:
[('c', 'a'), ('e', 'd'), ('f', 'd'), ('d', 'b'), ('b', 'a'), ('a', 'root')]
Upvotes: 0
Reputation: 10020
It may seem strange, but this function is called ancestors :)
nx.ancestors(G, your_node)
Upvotes: 12
Reputation: 1845
Using predecessors will only return the nodes with a direct edge to the input node. Finding all node's ancestors can be done as follows:
import networkx as nx
import matplotlib.pyplot as plt
G = nx.DiGraph()
# G is:
# e-->f
# ^
# |
# a--->b-->c-->d
#
G.add_edges_from([('a', 'b'),('b', 'c'),('c', 'd'), ('b', 'e'), ('e', 'f')])
T = nx.dfs_tree(G.reverse(), source='f').reverse()
# T is: a-->b-->e-->f
pos = nx.nx_pydot.pydot_layout(T, prog='dot')
nx.draw_networkx(T, pos=pos, arrows= True, with_labels=True)
plt.show()
What we do is simply running DFS from the input node on the reversed directed graph, and then reverse the result again to get the edges in their original direction.
The last three lines are for drawing the result.
Upvotes: 7