Huayi Wei

Reputation: 849

How to apply same operation on every block of block matrices in numpy efficiently?

I have a big 2d array as following shape:

B = [B_0, B_1, B_2, ..., B_n]

where B_0, B_1, ..., B_n all have the same number of rows but different numbers of columns, and n may be very large. I also have another 1d array idx with shape (n+1,), such that

B_i = B[:, idx[i]:idx[i+1]]

and idx[-1] (the last element of idx) is the total number of columns of B.

I want to apply the same matrix operation to every B_i, for example:

B_i.T@B_i

Or with another 2d array:

D = [[D_0], [D_1], ..., [D_n]]

where D_0, D_1, ..., D_n all have the same number of columns (equal to the number of rows of B) but different numbers of rows, and

D_i = D[idx[i]:idx[i+1], :]

and I want to compute D_i@B_i.

So my question is: how can I implement this in Python while avoiding the for loop?

The following is an example:

import numpy as np
from timeit import default_timer as timer
# Prepare the test data
n = 1000000 # the number of small matrices

idx = np.zeros(n+1, dtype=int)
idx[1:] = np.random.randint(1, 10, size=n)
idx = np.cumsum(idx)

B = np.random.rand(3, idx[-1])

# Computation
start = timer()
C = []
for i in range(n):
    B_i = B[:, idx[i]:idx[i+1]]
    C_i = B_i.T@B_i
    C.append(C_i)
end = timer()
print('Total time:', end - start)
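
A loop version of the second operation (D_i @ B_i) looks similar. A minimal sketch, assuming D is built as described above (the names D, E are illustrative):

D = np.random.rand(idx[-1], 3)  # block column matrix [[D_0], ..., [D_n]]

E = []
for i in range(n):
    D_i = D[idx[i]:idx[i+1], :]   # the rows belonging to block i
    B_i = B[:, idx[i]:idx[i+1]]   # the columns belonging to block i
    E.append(D_i @ B_i)           # E[i] has shape (w_i, w_i), w_i = idx[i+1]-idx[i]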

Upvotes: 2

Views: 301

Answers (2)

Huayi Wei

Reputation: 849

One can use map with a lambda function to do this job; see the following code:

import numpy as np
from timeit import default_timer as timer
# Prepare the test data
n = 1000000 # the number of small matrices

idx = np.zeros(n+1, dtype=int)
idx[1:] = np.random.randint(1, 10, size=n)
idx = np.cumsum(idx)

B = np.random.rand(3, idx[-1])
D = np.random.rand(idx[-1], 3)

BB = np.hsplit(B, idx[1:-1])  # split B at the column boundaries into [B_0, ..., B_n]
DD = np.vsplit(D, idx[1:-1])  # split D at the row boundaries into [D_0, ..., D_n]

CC = list(map(lambda x: x[0]@x[1], zip(DD, BB)))  # CC[i] = D_i @ B_i
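
The same pattern handles the first operation too; a minimal sketch:

CC2 = list(map(lambda b: b.T@b, BB))  # CC2[i] = B_i.T @ B_i

Note that map still iterates at the Python level, so the speedup over an explicit for loop is modest; hsplit/vsplit mainly make the code more compact.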

Upvotes: 0

hpaulj

Reputation: 231510

If I add to your code:

print(B.shape)
print(idx)
print([x.shape for x in C])

# Build a block-diagonal version of B: block i occupies rows 3*i:3*(i+1)
# and columns idx[i]:idx[i+1]; everything else stays zero.
Bnn = np.zeros((n, 3, idx[-1]))
for i in range(n):
    s = np.s_[idx[i]:idx[i+1]]
    Bnn[i,:,s] = B[:, s]
Bnn = Bnn.reshape(3*n,-1)
Cnn = Bnn.T @ Bnn   # one big product; the result is block diagonal
print(Bnn.shape, Cnn.shape)
print(Cnn.sum(), sum([x.sum() for x in C]))

and change n to 5, I get

2115:~/mypy$ python3 stack46209231.py 
(3, 31)    # B shape
[ 0  9 17 18 25 31]
[(9, 9), (8, 8), (1, 1), (7, 7), (6, 6)]  # shapes of C elements
(15, 31) (31, 31)     # shapes of diagonalized B and C
197.407879357 197.407879357   # C sums from the 2 routes

So my idea of building a block-diagonal version of B and performing the dot product with that works. For modest sized arrays that should be faster, though the iteration to create Bnn takes time, as does extracting the blocks from Cnn.
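
The individual products can then be read off the diagonal blocks of Cnn; a minimal sketch, assuming idx as above:

C_from_Cnn = [Cnn[idx[i]:idx[i+1], idx[i]:idx[i+1]] for i in range(n)]  # C_from_Cnn[i] == B_i.T @ B_i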

But Bnn and Cnn get very large as n grows, bog down with memory swapping, and eventually fail with memory errors.


With scipy's sparse.block_diag function, turning your B into a sparse block-diagonal matrix is quite easy:

from scipy import sparse

Blist = [B[:, idx[i]:idx[i+1]] for i in range(n)]  # the list [B_0, ..., B_n]
Bs = sparse.block_diag(Blist, format='bsr')        # sparse block-diagonal matrix
print(repr(Bs))
Cs = Bs.T@Bs
print(repr(Cs))
print(Cs.sum())

and a sample run

2158:~/mypy$ python3 stack46209231.py 
(3, 20)
[ 0  1  5  9 17 20]
[(1, 1), (4, 4), (4, 4), (8, 8), (3, 3)]
(15, 20) (20, 20)
94.4190125992 94.4190125992
<15x20 sparse matrix of type '<class 'numpy.float64'>'
    with 60 stored elements (blocksize = 1x1) in Block Sparse Row format>
<20x20 sparse matrix of type '<class 'numpy.float64'>'
    with 106 stored elements (blocksize = 1x1) in Block Sparse Row format>

and shapes and checksums match.

For n = 10000, Bnn is too large for my memory. The sparse Bs creation is slow, but the matrix multiplication is fast.
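
Recovering the dense per-block results from the sparse product takes a bit of slicing; a minimal sketch, converting to CSR first since the BSR format does not support slicing directly:

Ccsr = Cs.tocsr()
C_blocks = [Ccsr[idx[i]:idx[i+1], idx[i]:idx[i+1]].toarray() for i in range(n)]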

Upvotes: 1
