Reputation: 33
I have a graph of nodes which each represent about 100 voxels in the brain. I partitioned the graph into communities, but now I need to make a correlation matrix where every voxel in a node is connected to every voxel in the nodes that are in the same community. In other words, if nodes 1 and 2 are in the same community, I need a 1 in the matrix between every voxel in node 1 and every voxel in node 2. This takes a very long time with the code below. Does anyone know how to speed this up?
from itertools import combinations
import numpy as np

for edge in combinations(graph.nodes(), 2):
    # Only fill in voxel pairs whose nodes fall in the same community.
    if partition.get_node_community(edge[0]) == partition.get_node_community(edge[1]):
        # Find the voxels in each node; these are the matrix indices I want to fill.
        voxels1 = np.argwhere(flat_parcel == edge[0] + 1)
        voxels2 = np.argwhere(flat_parcel == edge[1] + 1)
        for voxel1 in voxels1:
            voxel_matrix[voxel1, voxels2] = 1
Thanks for the responses. I think the easiest and fastest solution is to replace the last loop with voxel_matrix[np.ix_(voxels1, voxels2)] = 1
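For anyone landing here later, here is a minimal self-contained sketch of what that np.ix_ assignment does, using toy labels rather than the real parcellation:

import numpy as np

n_voxels = 10
voxel_matrix = np.zeros((n_voxels, n_voxels), dtype=int)

# Toy parcel labels standing in for flat_parcel; pretend nodes 1 and 2
# ended up in the same community.
flat_parcel = np.array([1, 1, 1, 2, 2, 2, 3, 3, 3, 3])
voxels1 = np.argwhere(flat_parcel == 1).ravel()  # ravel() gives the 1-D indices np.ix_ expects
voxels2 = np.argwhere(flat_parcel == 2).ravel()

# One vectorized assignment fills the whole voxels1 x voxels2 block,
# replacing the inner Python loop over voxels1.
voxel_matrix[np.ix_(voxels1, voxels2)] = 1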
Upvotes: 2
Views: 730
Reputation: 151027
Here's an approach that I expect to work for you. It's a stretch on my machine -- even storing two copies of the voxel adjacency matrix (using dtype=bool) pushes my (somewhat old) desktop right to the edge of its memory capacity. But I'm assuming that you have a machine capable of handling at least two (300 * 100) ** 2 = 900 MB arrays -- otherwise, you would probably have run into problems before this stage. It takes my desktop about 30 minutes to process 30000 voxels.
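As a quick sanity check on that estimate (plain arithmetic, nothing specific to your data):

n_voxels = 300 * 100               # roughly 300 nodes with ~100 voxels each
bytes_per_matrix = n_voxels ** 2   # dtype=bool stores one byte per element
print(bytes_per_matrix / 1e6)      # -> 900.0 (MB)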
This assumes that voxel_communities is a simple array containing a community label for each voxel at index i. It sounds like you can generate that pretty quickly. It also assumes that voxels are present in only one node.
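For example, if flat_parcel holds node labels starting at 1 as in your snippet, something along these lines should produce voxel_communities (a sketch only; I haven't run it against your graph and partition objects):

import numpy as np

# Look up each node's community once, then map every voxel's node label to it.
node_to_comm = {node: partition.get_node_community(node) for node in graph.nodes()}
voxel_communities = np.array([node_to_comm[label - 1] for label in flat_parcel])

With that in place, here's the function itself: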
import numpy

def voxel_adjacency(voxel_communities):
    n_voxels = voxel_communities.size
    comm_labels = sorted(set(voxel_communities))
    comm_counts = [(voxel_communities == l).sum() for l in comm_labels]

    # Build a block-diagonal matrix with one all-True block per community,
    # as if the voxels were sorted by community label.
    blocks = numpy.zeros((n_voxels, n_voxels), dtype=bool)
    start = 0
    for c in comm_counts:
        blocks[start:start + c, start:start + c] = 1
        start += c

    # Undo the implicit sorting with an inverse argsort, first on rows, then on columns.
    ix = numpy.empty_like(voxel_communities)
    ix[voxel_communities.argsort()] = numpy.arange(n_voxels)
    blocks[:] = blocks[ix,:]
    blocks[:] = blocks[:,ix]
    return blocks
Here's a quick explanation. This uses an inverse indexing trick to reorder the columns and rows of an array of diagonal blocks into the desired matrix.
n_voxels = voxel_communities.size
comm_labels = sorted(set(voxel_communities))
comm_counts = [(voxel_communities == l).sum() for l in comm_labels]
blocks = numpy.zeros((n_voxels, n_voxels), dtype=bool)
start = 0
for c in comm_counts:
    blocks[start:start + c, start:start + c] = 1
    start += c
These lines are used to construct the initial block matrix. So for example, say you have six voxels and three communities, and each community contains two voxels. Then the initial block matrix will look like this:
array([[ True,  True, False, False, False, False],
       [ True,  True, False, False, False, False],
       [False, False,  True,  True, False, False],
       [False, False,  True,  True, False, False],
       [False, False, False, False,  True,  True],
       [False, False, False, False,  True,  True]], dtype=bool)
This is essentially the same as the desired adjacency matrix after the voxels have been sorted by community membership. So we need to reverse that sorting. We do so by constructing an inverse argsort array.
ix = numpy.empty_like(voxel_communities)
ix[voxel_communities.argsort()] = numpy.arange(n_voxels)
Now ix will reverse the sorting process when used as an index. And since this is a symmetric matrix, we can perform the reverse sorting operation separately on rows and then on columns:
blocks[:] = blocks[ix,:]
blocks[:] = blocks[:,ix]
return blocks
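If the inverse-argsort step looks mysterious, here is a tiny standalone demonstration of the same trick, independent of the function above:

import numpy

labels = numpy.array([0, 3, 1, 1, 0, 2])
order = labels.argsort()                 # positions that sort the labels by community
ix = numpy.empty_like(labels)
ix[order] = numpy.arange(labels.size)    # inverse permutation: where each voxel went

sorted_labels = labels[order]            # community-sorted labels
print(sorted_labels[ix])                 # -> [0 3 1 1 0 2], original order recovered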
Here's an example of the result it generates for a small input:
>>> voxel_adjacency(numpy.array([0, 3, 1, 1, 0, 2]))
array([[ True, False, False, False,  True, False],
       [False,  True, False, False, False, False],
       [False, False,  True,  True, False, False],
       [False, False,  True,  True, False, False],
       [ True, False, False, False,  True, False],
       [False, False, False, False, False,  True]], dtype=bool)
It seems to me that this does something quite similar to voxel_matrix[np.ix_(voxels1, voxels2)] = 1 as suggested by pv., except it does it all at once, instead of tracking each possible combination of nodes.
There may be a better solution, but this should at least be an improvement.
Also, note that if you can simply accept the new ordering of voxels as canonical, then this solution becomes as simple as creating the block array! That takes all of about 300 milliseconds.
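If you do take that route, a sketch of what it might look like is below: return the block array in the sorted order together with the permutation, and reorder your voxel-wise data once instead of shuffling the matrix.

import numpy

def sorted_voxel_adjacency(voxel_communities):
    # Same block construction as above, but the matrix is left in
    # community-sorted order; `order` says how to reorder voxel data to match.
    order = voxel_communities.argsort()
    comm_labels = sorted(set(voxel_communities))
    comm_counts = [(voxel_communities == l).sum() for l in comm_labels]

    n_voxels = voxel_communities.size
    blocks = numpy.zeros((n_voxels, n_voxels), dtype=bool)
    start = 0
    for c in comm_counts:
        blocks[start:start + c, start:start + c] = True
        start += c
    return blocks, order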
Upvotes: 1