Reputation: 21343
I have a sparse weighted directed graph represented in a file with each line in the format
from to weight
I would like to read it into scipy's compressed sparse format so I can perform simple traversal and graph algorithms on it (or in fact any memory efficient representation). However, given a node I would like to be able to list all its outgoing edges in order of weight quickly without having to sort them each time. It's ok to sort each one once of course.
Is it possible to do this in scipy or using any other python package?
Upvotes: 2
Views: 2130
Reputation: 67457
You can load your data with something like this:
import numpy as np
import scipy.sparse as sps
data = np.genfromtxt('data.txt', dtype=[('from', np.intp),
('to', np.intp),
('weight', np.float)])
If you want to store your weights in a sparse matrix graph
, where graph[i, j]
is the weight from node i
to node j
, you could do:
graph = sps.csr_matrix((data['weight'], (data['from'], data['to'])))
To have a sorted list of outgoing nodes, I would use a dictionary, where sorted_to
is an array of outgoing nodes sorted by weight. It's a little hacky, and dependent on the CSR sparse matrix format, but you could do:
graph = sps.rand(10, 10, density=0.1, format='csr')
data, indptr, indices = graph.data, graph.indptr, graph.indices
non_empty_rows, = np.nonzero(np.diff(graph.indptr))
sorted_out = {}
for j in non_empty_rows:
weight_slice = data[indptr[j]:indptr[j+1]]
out_slice = indices[indptr[j]:indptr[j+1]]
sorted_out[j] = out_slice[np.argsort(weight_slice)]
With a simple made up example:
>>> graph = sps.rand(5, 5, density=0.2, format='csr')
>>> graph.toarray()
array([[ 0.88968871, 0. , 0. , 0.80773932, 0. ],
[ 0. , 0. , 0.8921645 , 0. , 0. ],
[ 0. , 0. , 0. , 0. , 0. ],
[ 0.18552664, 0. , 0. , 0. , 0. ],
[ 0. , 0. , 0. , 0. , 0.22945956]])
>>> non_empty_rows
array([0, 1, 3, 4], dtype=int64)
>>> sorted_out
{0: array([3, 0]), 1: array([2]), 3: array([0]), 4: array([4])}
Upvotes: 2