Reputation: 11741
I have the following code:
import networkx
def reverse_graph(g):
reversed = networkx.DiGraph()
for e in g.edges():
reversed.add_edge(e[1], e[0])
return reversed
g = networkx.DiGraph()
for i in range(500000):
g.add_edge(i, i+1)
g2 = g.reverse()
g3 = reverse_graph(g)
And according to my line profiler I am spending WAY more time reversing the graph using networkx
(their reverse took about 21 sec, mine took about 7). The overhead seems high in this simple case, and it's even worse in other code I have with more complex objects. Is there something happening under the hood of networkx
I'm not aware of? This seems like it should be a relatively cheap function.
For reference, here is the doc for the reverse
function
EDIT: I also tried running the implementations the other way around (i.e. mine first) to make sure there was no cacheing happening when they created theirs. Mine was still significantly faster
Upvotes: 2
Views: 418
Reputation: 880429
The source code for the reverse method looks like this:
def reverse(self, copy=True):
"""Return the reverse of the graph.
The reverse is a graph with the same nodes and edges
but with the directions of the edges reversed.
Parameters
----------
copy : bool optional (default=True)
If True, return a new DiGraph holding the reversed edges.
If False, reverse the reverse graph is created using
the original graph (this changes the original graph).
"""
if copy:
H = self.__class__(name="Reverse of (%s)"%self.name)
H.add_nodes_from(self)
H.add_edges_from( (v,u,deepcopy(d)) for u,v,d
in self.edges(data=True) )
H.graph=deepcopy(self.graph)
H.node=deepcopy(self.node)
else:
self.pred,self.succ=self.succ,self.pred
self.adj=self.succ
H=self
return H
So by default, when copy=True
, not only are the edge nodes reversed,
but also a deepcopy of any edge data is made. Then the graph attributes (held in
self.graph
) are deepcopied, and then the nodes themselves are deepcopied.
That's a lot of copying that reverse_graph
does not do.
If you don't deepcopy everything, modifying g3
could affect g
.
If you don't need to deepcopy everything, (and if mutating g
is acceptable) then
g.reverse(copy=False)
is even faster than
g3 = reverse_graph(g)
In [108]: %timeit g.reverse(copy=False)
1000000 loops, best of 3: 359 ns per loop
In [95]: %timeit reverse_graph(g)
1 loops, best of 3: 1.32 s per loop
In [96]: %timeit g.reverse()
1 loops, best of 3: 4.98 s per loop
Upvotes: 2