bmello

Reputation: 1984

In networkx, how to update edge weights in a vectorized way?

I have to define a network where the weight of each edge must be equal to the number of connections between each pair of nodes. The following code generates such a network:

In [1]: import networkx as nx

In [2]: g = nx.Graph()

In [3]: connections = [[1,2],[2,3],[1,2],[2,1],[1,4],[2,3]]

In [4]: for e1, e2 in connections:
   ...:     if g.has_edge(e1, e2):
   ...:         g[e1][e2]['weight'] += 1
   ...:     else:
   ...:         g.add_edge(e1, e2, weight=1)
   ...:

In [5]: g.edges(data=True)
Out[5]: [(1, 2, {'weight': 3}), (1, 4, {'weight': 1}), (2, 3, {'weight': 2})]

In the real situation, the connection list will contain thousands of pairs. Thousands of such lists will be generated, and each of them must be immediately included in the network and deleted since there is no memory to store all lists together.

Since Python is an interpreted language, I cannot use a "for" loop, because it would take forever to run. Maybe "vectorize" is not the proper word; what I mean is something similar to what we do with NumPy arrays, where commands operate on all elements at once instead of using "for" to operate on each element.

Upvotes: 1

Views: 2614

Answers (2)

Kirell

Reputation: 9798

I am afraid you will need a for loop in any case, but it's not that slow. NetworkX is actually quite slow in general because of the way it stores nodes and edges (as dicts). If you want to apply functions to some attributes using NumPy, I suggest you try graph-tool instead.

Concerning the issue at hand, I think I have a better way:

import networkx as nx
from collections import Counter

# This will yield weighted edges on the fly, no storage cost occurring
def gen_edges(counter):
    for (u, v), count in counter.items():  # use counter.iteritems() on Python 2
        yield u, v, count

g = nx.Graph()
connections = [[1,2],[2,3],[1,2],[2,1],[1,4],[2,3]]

# Create histogram of pairs using native Python collections.
# Counter needs hashable keys, so each pair becomes a tuple; sorting it first
# makes (2, 1) and (1, 2) count as the same undirected edge.
c = Counter(tuple(sorted(pair)) for pair in connections)
# Add weighted edges
g.add_weighted_edges_from(gen_edges(c))

print("Number of nodes:", g.number_of_nodes())
print("Number of edges:", g.number_of_edges())
print(list(g.edges(data=True)))

Output:

Number of nodes: 4
Number of edges: 3
[(1, 2, {'weight': 3}), (1, 4, {'weight': 1}), (2, 3, {'weight': 2})]
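
The question mentions that thousands of lists arrive one after another. Since add_weighted_edges_from sets the weight rather than adding to it, a later list would overwrite the counts of an earlier one. Here is a minimal sketch of accumulating across lists, assuming node labels can be sorted. The helper add_batch and the second batch of pairs are hypothetical, made up for illustration:

```python
from collections import Counter

import networkx as nx


def add_batch(g, connections):
    """Fold one batch of node pairs into g, incrementing edge weights.

    add_batch is a hypothetical helper, not part of networkx.
    """
    # Sort each pair so that (2, 1) and (1, 2) count as the same undirected edge.
    counts = Counter(tuple(sorted(pair)) for pair in connections)
    # One Python-level iteration per distinct edge, not per connection.
    for (u, v), n in counts.items():
        if g.has_edge(u, v):
            g[u][v]["weight"] += n
        else:
            g.add_edge(u, v, weight=n)


g = nx.Graph()
# The first batch is the list from the question; the second is made up.
add_batch(g, [[1, 2], [2, 3], [1, 2], [2, 1], [1, 4], [2, 3]])
add_batch(g, [[2, 1], [3, 4]])
print(sorted(g.edges(data="weight")))
```

Each list can be discarded as soon as add_batch returns, so only the graph itself stays in memory.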

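Note that numpy.unique flattens the array by default, so a plain call cannot count the histogram of edges. From NumPy 1.13 on, passing axis=0 together with return_counts=True counts unique rows instead.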
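
On NumPy versions where numpy.unique accepts the axis argument (1.13 and later), the flattening can be avoided and the counting done without a Python-level loop over the connections. This is a rough sketch under that assumption, with integer node labels; the variable names are only illustrative:

```python
import networkx as nx
import numpy as np

connections = np.array([[1, 2], [2, 3], [1, 2], [2, 1], [1, 4], [2, 3]])

# Sort within each row so that (2, 1) becomes (1, 2).
pairs = np.sort(connections, axis=1)
# axis=0 treats each row as one item; return_counts gives its multiplicity.
unique_pairs, counts = np.unique(pairs, axis=0, return_counts=True)

g = nx.Graph()
# Convert NumPy integers to plain ints so node labels are ordinary Python ints.
g.add_weighted_edges_from(
    (int(u), int(v), int(n)) for (u, v), n in zip(unique_pairs, counts)
)
print(sorted(g.edges(data="weight")))
```

The remaining generator still iterates once per distinct edge, but the counting itself happens inside NumPy.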

Upvotes: 1

Medhat Gayed

Reputation: 2813

Try using the map or imap method of the Pool class of the multiprocessing library.

https://docs.python.org/2/library/multiprocessing.html

You can then create a function which checks and adds the edges and get imap to execute your function for each element in parallel.

Check the example at the bottom of this link:

https://docs.python.org/2/library/multiprocessing.html#module-multiprocessing.pool

Upvotes: 0
