Reputation: 1315
I'm implementing some similarity measures with python-igraph, in particular common neighbors and preferential attachment.
Initially I have this:
#!/usr/bin/env python
# encoding: utf-8
import igraph

def preferential_attachment(g, i, j):
    return g.degree(i) * g.degree(j)

def common_neighbors(g, i, j):
    return len(set(g.neighbors(i)).intersection(g.neighbors(j)))
But I think there's room to improve performance here. Does anyone have ideas on how to make this code faster?
Upvotes: 0
Views: 1070
Reputation: 48071
Pre-calculate the neighbor sets into an adjacency list, and then use the items from that list instead of querying the neighbors over and over. The same trick helps with the degree calculations: there is no need to call a method for each pair, since you can look the degree up in a pre-built list instead:
class PrecalculatedStuff(object):
    def __init__(self, graph):
        self.graph = graph
        self.degrees = graph.degree()
        # list() is needed on Python 3, where map() returns an iterator
        self.adjlist = list(map(set, graph.get_adjlist()))

    def degree_product(self, i, j):
        return self.degrees[i] * self.degrees[j]

    def common_neighbors(self, i, j):
        # len() so this matches the count returned by the original function
        return len(self.adjlist[i].intersection(self.adjlist[j]))
Also, the product of degrees is probably more efficient to calculate if you use NumPy - essentially you take the degree list, cast it into a NumPy vector and then multiply the vector (as a column vector) with its transpose (i.e. a row vector). The result is a matrix that contains the degree product for all pairs of nodes, and the looping is then done in C instead of in Python.
Upvotes: 2