Reputation: 1984
I have to build a network in which the weight of each edge equals the number of connections between its pair of nodes. The following code generates such a network:
In [1]: import networkx as nx
In [2]: g = nx.Graph()
In [3]: connections = [[1,2],[2,3],[1,2],[2,1],[1,4],[2,3]]
In [4]: for e1,e2 in connections :
   ...:     if g.has_edge(e1,e2) :
   ...:         g[e1][e2]['weight'] += 1
   ...:     else :
   ...:         g.add_edge(e1,e2,weight=1)
   ...:
In [5]: g.edges(data=True)
Out[5]: [(1, 2, {'weight': 3}), (1, 4, {'weight': 1}), (2, 3, {'weight': 2})]
In the real situation, the connection list will contain thousands of pairs. Thousands of such lists will be generated, and each of them must be included in the network immediately and then deleted, because there is not enough memory to store all the lists together.
Since Python is an interpreted language, I cannot use a "for" loop, because it would take forever to run. Maybe "vectorize" is not the proper word; what I mean is something similar to what we do with numpy arrays, where there are commands that operate on all elements at once instead of using "for" to operate on each element.
Upvotes: 1
Views: 2614
Reputation: 9798
I am afraid you will need a for loop in any case, but it's not that slow. Networkx is actually quite slow in general because of the way it stores nodes and edges (as dicts). If you want to apply functions to some attributes using numpy, I suggest you try graph-tool instead.
Concerning the issue at hand, I think I have a better way:
import networkx as nx
from collections import Counter

# This will yield weighted edges on the fly, at no extra storage cost
def gen_edges(counter):
    for k, v in counter.iteritems():  # change to counter.items() for Py3k+
        yield k[0], k[1], v

g = nx.Graph()
# The pairs must be hashable, so convert each one to a tuple. Sorting each
# pair first makes (2, 1) and (1, 2) count as the same undirected edge.
connections = [tuple(sorted(p)) for p in [[1,2],[2,3],[1,2],[2,1],[1,4],[2,3]]]
# Create histogram of pairs using native Python collections
c = Counter(connections)
# Add weighted edges
g.add_weighted_edges_from(gen_edges(c))
print nx.info(g)
print g.edges(data=True)
Output:
Name:
Type: Graph
Number of nodes: 4
Number of edges: 3
Average degree: 1.5000
[(1, 2, {'weight': 3}), (1, 4, {'weight': 1}), (2, 3, {'weight': 2})]
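Since the question mentions thousands of lists that cannot all be kept in memory, here is a minimal sketch of how the Counter idea could be extended to fold the lists in one at a time. The helper names `normalize` and `accumulate` are made up for this illustration, and the sketch assumes the node labels can be compared with `<=`. It is written for Python 3 and uses only the standard library.

```python
from collections import Counter


def normalize(pair):
    """Order the two endpoints so (2, 1) and (1, 2) map to the same key."""
    u, v = pair
    return (u, v) if u <= v else (v, u)


def accumulate(batches):
    """Fold an iterable of connection lists into one Counter of edge counts."""
    totals = Counter()
    for batch in batches:
        # Only the running totals are kept; nothing here holds on to
        # `batch` once the loop moves to the next one.
        totals.update(normalize(pair) for pair in batch)
    return totals


# Two small batches standing in for the thousands of lists in the question.
batches = iter([
    [[1, 2], [2, 3], [1, 2]],
    [[2, 1], [1, 4], [2, 3]],
])
totals = accumulate(batches)

# (u, v, weight) triples in the shape add_weighted_edges_from expects.
weighted_edges = [(u, v, w) for (u, v), w in totals.items()]
```

The resulting `weighted_edges` list can be handed to `g.add_weighted_edges_from(...)` once at the end, so the graph is only touched one time per distinct edge.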
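Note that numpy.unique flattens the array by default, so on its own it cannot build the histogram of edges. NumPy 1.13 and later accept axis=0 together with return_counts=True, which counts unique rows instead.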
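If NumPy 1.13 or later is available, `numpy.unique` takes an `axis` argument that avoids the flattening mentioned above, which gets closer to the "vectorized" counting the question asks about. The following is only a sketch: `count_edges` is a hypothetical helper name, and it assumes integer node labels. A Python-level loop remains, but it runs once per distinct edge rather than once per connection.

```python
import numpy as np


def count_edges(connections):
    """Return (u, v, weight) tuples for a sequence of integer node pairs."""
    pairs = np.asarray(connections)
    # Sort within each row so (2, 1) and (1, 2) become the same undirected edge.
    pairs = np.sort(pairs, axis=1)
    # axis=0 treats each row as one item; requires NumPy >= 1.13.
    unique_pairs, counts = np.unique(pairs, axis=0, return_counts=True)
    # Convert NumPy scalars to plain ints so they behave as ordinary node labels.
    return [(int(u), int(v), int(w)) for (u, v), w in zip(unique_pairs, counts)]


weighted_edges = count_edges([[1, 2], [2, 3], [1, 2], [2, 1], [1, 4], [2, 3]])
```

The returned list has the `(u, v, weight)` shape that `g.add_weighted_edges_from(...)` accepts.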
Upvotes: 1
Reputation: 2813
Try using the map or imap method of the Pool class of the multiprocessing library.
https://docs.python.org/2/library/multiprocessing.html
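You can then create a function which checks and counts the edges, and get imap to execute your function for each element in parallel. Keep in mind that each worker process works on its own copy of the data, so the function should return its result (for example, the edge counts for one list) for the parent process to merge, rather than modify a shared graph.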
Check the example at the bottom of this link:
https://docs.python.org/2/library/multiprocessing.html#module-multiprocessing.pool
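A rough sketch of how this could look with `Pool.imap`, written for Python 3. Worker processes do not share the parent's graph object, so this version has each worker count one list and lets the parent merge the partial counts. The function names `count_batch`, `merge_counts` and `parallel_edge_counts` are invented for the example, and whether this is actually faster than a plain loop depends on the size of the lists and on the cost of sending them to the workers.

```python
from collections import Counter
from multiprocessing import Pool


def count_batch(batch):
    """Count the undirected edges in one connection list (runs in a worker)."""
    return Counter(tuple(sorted(pair)) for pair in batch)


def merge_counts(partials):
    """Add up the per-list Counters in the parent process."""
    totals = Counter()
    for partial in partials:
        totals.update(partial)
    return totals


def parallel_edge_counts(batches, processes=2):
    """Count every list in a worker pool and return the merged totals."""
    with Pool(processes=processes) as pool:
        # imap yields results lazily, so partial counts are merged as they arrive;
        # merge_counts consumes all of them before the pool is shut down.
        return merge_counts(pool.imap(count_batch, batches))
```

To use it, call `parallel_edge_counts(batches)` from under an `if __name__ == "__main__":` guard and pass `((u, v, w) for (u, v), w in totals.items())` to `g.add_weighted_edges_from(...)` in the parent process.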
Upvotes: 0