Jorge

Reputation: 83

Operating efficiently on submatrices of 2D numpy arrays

I'm having trouble doing the following matrix operation efficiently. Starting from a square matrix (a 2D numpy array) and groups that span every index of the matrix (a dictionary whose keys are the groups and whose values are lists of the matrix indices in each group), I need to obtain a new, smaller matrix containing the sum of the elements in each submatrix of the original matrix. The submatrices are defined by the indices of the groups. The new matrix will therefore also be square, with the number of groups as its dimension.

Let's look at the following example:

import numpy as np

X = np.arange(49).reshape((7, 7))

d = {0: [0, 1], 1: [2, 3, 4], 2: [5, 6]}

def get_new_matrix(matrix, groups_indexes):
    groups_number = len(groups_indexes)
    new_matrix = np.zeros((groups_number, groups_number))
    for i in range(groups_number):
        for j in range(groups_number):
            new_matrix[i][j] = np.sum(matrix[groups_indexes[i]][:,groups_indexes[j]])
    return new_matrix

Z = get_new_matrix(X, d)
print(Z)

[[ 16  39  36]
 [129 216 159]
 [156 249 176]]

Looking at the result, for example at row 1 (the second row) and column 2 (the third column), we see the value is 159:

Z[1,2]

This means that in the original matrix, the submatrix defined by group 1 in the rows and group 2 in the columns, i.e. rows 2, 3 and 4 and columns 5 and 6, is explicitly:

X[[2, 3, 4]][:,[5, 6]]

and the sum of all the elements in the submatrix is 19+20+26+27+33+34=159.

Explicitly:

np.sum(X[[2, 3, 4]][:,[5, 6]])

Is there any way to write more Pythonic code, avoiding the two for loops, to obtain the new matrix and improve the efficiency overall? I guess it should involve something like fancy indexing or broadcasting, but I couldn't find a better solution yet.

My current code scales terribly for large initial matrices (and potentially a large number of groups as well), and since I will run it not only on arbitrarily large initial square matrices but also over many iterations, I really need to improve it. Or maybe there is no way to make the code better, in which case an explanation would be very useful as well :)

Upvotes: 3

Views: 680

Answers (1)

Mad Physicist

Reputation: 114320

If your group indices span the entire matrix and are contiguous, you can store them as just the start indices instead of a dictionary. Since each group ends where the next group begins, you only need to store each group's start index. Your current d could be rewritten as

d = sorted(val[0] for val in d.values())

Or, if you are not tied to the dictionary format, just

d = np.array([0, 2, 5])

My recommendation is to apply np.add.reduceat twice, once along each dimension, essentially as you are doing in your current loop, but having numpy manage the looping for you internally:

result = np.add.reduceat(np.add.reduceat(X, d, axis=0), d, axis=1)

The result for the input in the question is:

array([[ 16,  39,  36],
       [129, 216, 159],
       [156, 249, 176]])

159 is indeed the element at index [1, 2].
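To double-check against the question's example, the reduceat result can be compared entry by entry with the explicit fancy-indexed sum (with the start indices derived from the question's d):

```python
import numpy as np

X = np.arange(49).reshape(7, 7)
d = {0: [0, 1], 1: [2, 3, 4], 2: [5, 6]}

# Derive the start index of each contiguous group from the dictionary.
starts = sorted(val[0] for val in d.values())  # [0, 2, 5]

# Sum over row groups, then over column groups.
result = np.add.reduceat(np.add.reduceat(X, starts, axis=0), starts, axis=1)

print(result[1, 2])                                      # 159
print(result[1, 2] == np.sum(X[[2, 3, 4]][:, [5, 6]]))   # True
```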

This seems to scale fairly well. Running with X = np.arange(10**6).reshape(10**3, 10**3) and d = np.arange(0, 10**3, 10) takes about 2.27ms on my not-very-overpowered laptop. I don't think that this bit of code is likely to be a bottleneck for anything you do.
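For reference, a rough timing sketch with those same inputs (absolute numbers will of course vary by machine):

```python
import numpy as np
from timeit import timeit

X = np.arange(10**6).reshape(10**3, 10**3)
starts = np.arange(0, 10**3, 10)  # 100 contiguous groups of 10 indices each

# Average over repeated calls to smooth out noise.
t = timeit(
    lambda: np.add.reduceat(np.add.reduceat(X, starts, axis=0), starts, axis=1),
    number=100,
)
print(f"{t / 100 * 1e3:.2f} ms per call")
```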

Upvotes: 4
