pythondependenciesnetworkxdirected-acyclic-graphs

Reputation: 189

How to find all node's ancestors in NetworkX?

I have a large DiGraph that consists of dependent jobs. For example, for the graph a>b>c, job c can run only once job b is completed. I tried to find a function that gets all c's upstream jobs (that is (a, b)). I used DiGraph.predecessors, but it only returns job b.

Is there a function that will list all c's upstream jobs?

How can I draw the dependency diagram for a leaf node (like job c)?

I have read the documentation but couldn't find the answer.

Upvotes: 14

Answers (3)

Fiver

Reputation: 10165

I had a similar problem for a recent project, where jobs were dependent on other jobs. I found the most reliable way to guarantee processing them in the correct order was to use the DiGraph method topological_sort.

Below is a minimal example, starting with the creation of the DiGraph:

import networkx as nx
import matplotlib.pyplot as plt
from networkx.drawing.nx_pydot import graphviz_layout

# this function is optional
# just moves the node labels on the plot so they don't overlap the lines 
def nudge(pos, x_shift, y_shift):
    return {n:(x + x_shift, y + y_shift) for n,(x,y) in pos.items()} 

# start with a directed graph
dag = nx.DiGraph()

dag.add_edges_from(
    [
        ('root', 'a'), 
        ('a', 'b'),
        ('a', 'c'),
        ('b', 'd'),
        ('d', 'e'),
        ('d', 'f')
    ]
)

# plot the graph
pos = graphviz_layout(dag, prog="dot")
pos_labels = nudge(pos, -5, 15)

fig = plt.figure(figsize=(8, 6))
ax = plt.subplot(1, 1, 1)
plt.margins(0.1)

nx.draw(
    dag, 
    pos=pos,
    ax=ax,
    with_labels=False,
    arrows=True,
    node_size=100, 
    arrowsize=20
)

labels = nx.draw_networkx_labels(
    dag,
    pos=pos_labels,
    ax=ax,
    font_size=18,
    font_weight='bold', 
    font_color='orange',
    verticalalignment='bottom'
)

The output plot of the graph looks like:

Probably the best thing to do first is verify the graph is a directed acyclic graph:

nx.is_directed_acyclic_graph(dag)

That will return True or False. Next, we can use topological_sort to retrieve the nodes in dependency order:

list(nx.algorithms.topological_sort(dag))

This returns:

['root', 'a', 'b', 'c', 'd', 'e', 'f']

If you would instead like to have the edge pairs instead of just the nodes:

list(nx.topological_sort(nx.line_graph(dag)))

which returns:

[('root', 'a'), ('a', 'b'), ('a', 'c'), ('b', 'd'), ('d', 'e'), ('d', 'f')]

Finally, if the jobs need to be in reverse order you can use reverse:

dag_reverse = dag.reverse()

Re-running the commands::

list(nx.algorithms.topological_sort(dag_reverse))

gives:

['c', 'e', 'f', 'd', 'b', 'a', 'root']

and:

list(nx.topological_sort(nx.line_graph(dag_reverse)))

gives:

[('c', 'a'), ('e', 'd'), ('f', 'd'), ('d', 'b'), ('b', 'a'), ('a', 'root')]

Upvotes: 0

vurmux

Reputation: 10030

It may seem strange, but this function is called ancestors :)

nx.ancestors(G, your_node)

Upvotes: 12

zohar.kom

Reputation: 1845

Using predecessors will only return the nodes with a direct edge to the input node. Finding all node's ancestors can be done as follows:

import networkx as nx
import matplotlib.pyplot as plt
G = nx.DiGraph()

# G is:
#      e-->f
#      ^
#      |
# a--->b-->c-->d
#

G.add_edges_from([('a', 'b'),('b', 'c'),('c', 'd'), ('b', 'e'), ('e', 'f')])
T = nx.dfs_tree(G.reverse(), source='f').reverse()     

# T is: a-->b-->e-->f

pos = nx.nx_pydot.pydot_layout(T, prog='dot')
nx.draw_networkx(T, pos=pos, arrows= True, with_labels=True)
plt.show()

What we do is simply running DFS from the input node on the reversed directed graph, and then reverse the result again to get the edges in their original direction.

The last three lines are for drawing the result.

Upvotes: 7

How to find all node&#39;s ancestors in NetworkX?

Answers (3)

Related Questions

How to find all node's ancestors in NetworkX?