Reputation: 158
I have a numpy array f of length n and a numpy array A of shape n x m. I want to break f and A into r pieces f1,...,fr and A1,...,Ar, then compute each product fi*Ai (vector-matrix multiplication in the mathematical sense), where each fi is a row vector whose number of columns equals the number of rows of Ai. Each result is a 1 x m row vector. The idea is to stack these row vectors to form the matrix B = [ [f1*A1], [f2*A2], ..., [fr*Ar] ] (note that B has size r x m).
Suppose f and A are already defined, and suppose the split indices of the pieces are in a list d = [0, d1, ..., dr] (for example, f1 = f[d[0]:d[1]] and f2 = f[d[1]:d[2]]). I was using the following piece of code to solve my problem:
B = numpy.zeros([r, m])
for i in range(r):
    lower = d[i]
    upper = d[i + 1]
    B[i, :] = f[lower:upper].dot(A[lower:upper, :])
The problem is that this piece of code runs many times in my program. I have heard that Python for loops are slow, and indeed this part is the bottleneck of my code. I can't figure out how to vectorize it, but I feel it should be possible. I was hoping someone here could show me the way. Thanks.
Upvotes: 0
Views: 132
Reputation: 231738
I assume this is a valid MCVE:
In [139]: f = np.arange(10)
In [140]: A = np.arange(20).reshape(10,2)
In [141]: f.dot(A)
Out[141]: array([570, 615])
In [142]: d = [0,2,5,10]
In [143]: for i,j in zip(d[:-1],d[1:]):
...: print(f[i:j].dot(A[i:j,:]))
...:
[2 3]
[58 67]
[510 545]
where 570 = 2 + 58 + 510.
In [145]: np.array([f[i:j].dot(A[i:j,:]) for i,j in zip(d[:-1],d[1:])])
Out[145]:
array([[  2,   3],
       [ 58,  67],
       [510, 545]])
Given that the i:j slices can vary in length, it may be hard to 'vectorize' this in the true sense. We can hide the iterations, but writing it in a way that moves all iterations into compiled code will be tricky. Accumulative operations like cumsum are often the best bet. We often have to step back and look at the problem from a different perspective (as opposed to simply removing the loop).
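For example, here is what a cumsum version could look like, a sketch using the same f, A and d as above (the zero-row trick is my own): take the running sum of the scaled rows f[:, None] * A and difference it at the split points.

import numpy as np

f = np.arange(10)
A = np.arange(20).reshape(10, 2)
d = np.array([0, 2, 5, 10])

# running sum of f[i] * A[i, :] down the rows
C = np.cumsum(f[:, None] * A, axis=0)
# prepend a zero row so the differences at the split points work uniformly
C = np.vstack([np.zeros((1, A.shape[1]), dtype=C.dtype), C])
B = C[d[1:]] - C[d[:-1]]        # shape (r, m), one row per piece
print(B)
# [[  2   3]
#  [ 58  67]
#  [510 545]]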
numba and cython are often used to speed up iterative solutions, but I won't get into those beyond the minimal sketch below.
If d divides the arrays into equal parts, we can use reshaping to calculate the pieces:
In [228]: A.shape
Out[228]: (10, 2)
In [229]: f.shape
Out[229]: (10,)
In [230]: f2 = f.reshape(2,5)
In [231]: A2 = A.reshape(2,5,2)
In [233]: np.einsum('ij,ijk->ik',f2,A2)
Out[233]:
array([[ 60,  70],
       [510, 545]])
The matmul operator also works, though it requires some fiddling with the dimensions:
In [236]: (f2[:,None,:]@A2)[:,0,:]
Out[236]:
array([[ 60,  70],
       [510, 545]])
If d divides the arrays into just a couple of sizes, I imagine we could group common sizes, and perform the above reshape and einsum on each group, but I haven't worked out the details:
In [238]: d = [0,2,5,7,10]
In [239]: np.array([f[i:j].dot(A[i:j,:]) for i,j in zip(d[:-1],d[1:])])
Out[239]:
array([[  2,   3],
       [ 58,  67],
       [122, 133],
       [388, 412]])
In [240]: [f[i:j] for i,j in zip(d[:-1],d[1:])]
Out[240]: [array([0, 1]), array([2, 3, 4]), array([5, 6]), array([7, 8, 9])]
Here we have 2 size groups: two slices of length 2 and two of length 3. A sketch of the grouping idea follows.
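This is my attempt at those details, using the same f, A and d: bucket the slices by length, run one reshaped einsum per bucket, and scatter the results back into B in the original order.

import numpy as np

f = np.arange(10)
A = np.arange(20).reshape(10, 2)
d = [0, 2, 5, 7, 10]

starts, stops = np.array(d[:-1]), np.array(d[1:])
lengths = stops - starts
B = np.empty((len(lengths), A.shape[1]), dtype=A.dtype)

for L in np.unique(lengths):
    idx = np.nonzero(lengths == L)[0]                  # pieces of this length
    rows = (starts[idx, None] + np.arange(L)).ravel()  # their row indices
    fG = f[rows].reshape(len(idx), L)                  # (groups, L)
    AG = A[rows].reshape(len(idx), L, A.shape[1])      # (groups, L, m)
    B[idx] = np.einsum('ij,ijk->ik', fG, AG)           # one product per group

print(B)
# [[  2   3]
#  [ 58  67]
#  [122 133]
#  [388 412]]

The Python-level loop now runs once per distinct length rather than once per piece.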
Upvotes: 1
Reputation: 53119
You can use np.add.reduceat. Each fi*Ai is just the sum of the rows of Ai scaled by the entries of fi, so we can form the scaled rows f[:, None] * A once and let reduceat sum each group of rows:
# example data
>>> f = np.arange(10)
>>> A = np.arange(50).reshape(10, 5)
>>> split = [0, 3, 5, 10]
>>>
# reduceat
>>> np.add.reduceat(f[:, None] * A, split[:-1], axis=0)
array([[  25,   28,   31,   34,   37],
       [ 125,  132,  139,  146,  153],
       [1275, 1310, 1345, 1380, 1415]])
>>>
# double check against list comprehension
>>> [fi @ Ai for fi, Ai in zip(*map(np.split, (f, A), 2*(split[1:-1],)))]
[array([25, 28, 31, 34, 37]), array([125, 132, 139, 146, 153]), array([1275, 1310, 1345, 1380, 1415])]
I wouldn't be surprised if the list comprehension, @hpaulj's solution, or the OP's loop were faster, because of BLAS-accelerated matrix multiplication.
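If you want to check, here is a simple timing harness sketch (no results claimed; the sizes are made up):

import numpy as np
from timeit import timeit

n, m = 10_000, 50
f = np.random.rand(n)
A = np.random.rand(n, m)
split = np.arange(0, n + 1, 100)   # equal pieces, just for timing

def loop():
    return np.array([f[i:j] @ A[i:j] for i, j in zip(split[:-1], split[1:])])

def reduceat():
    return np.add.reduceat(f[:, None] * A, split[:-1], axis=0)

assert np.allclose(loop(), reduceat())
print('loop    :', timeit(loop, number=100))
print('reduceat:', timeit(reduceat, number=100))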
Upvotes: 1