Tunneller
Tunneller

Reputation: 589

concatenated matrix multiplication

any suggestions for a fast multiply of A * diag(e) * A^T * f for dense matrix A and vectors e, f?

This is what I have now.

v[:] = 0
for i in range(N):
    for j in range(N):
        v[i] = v[i]+A[i,j]*e[j]*np.dot(A[:,j],f)

Thanks,

Upvotes: 2

Views: 853

Answers (2)

Divakar
Divakar

Reputation: 221514

Comments by @rubenvb's, where it was suggested to use A.dot(np.diag(e)).dot(A.transpose()).dot(f) should make it really fast. But, we don't really need to make it a 2D array of diag(e) there and thus skip one matrix-multiplication. Additionally, we can swap places for A.T and f and thus avoid the transpose too. Thus, a simplified and much more efficient solution would evolve, like so -

A.dot(e*f.dot(A))

Here's a quick runtime test on decent sized arrays on all the proposed approaches -

In [226]: # Setup inputs
     ...: N = 200
     ...: A = np.random.rand(N,N)
     ...: e = np.random.rand(N,)
     ...: f = np.random.rand(N,)
     ...: 

In [227]: %timeit np.einsum('ij,j,kj,k', A, e, A, f) # @Warren Weckesser's soln
10 loops, best of 3: 77.6 ms per loop

In [228]: %timeit A.dot(np.diag(e)).dot(A.transpose()).dot(f) # @rubenvb's soln
10 loops, best of 3: 18.6 ms per loop

In [229]: %timeit A.dot(e*f.dot(A)) # Proposed here
10000 loops, best of 3: 100 µs per loop

Upvotes: 2

Warren Weckesser
Warren Weckesser

Reputation: 114781

The suggestion made by @rubenvb is probably the simplest way to do it. Another way is to use einsum.

Here's an example. I'll use the following a, e and f:

In [95]: a
Out[95]: 
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [96]: e
Out[96]: array([-1,  2,  3])

In [97]: f
Out[97]: array([5, 4, 1])

This is the direct translation of your formula into numpy code. It is basically the same as @rubenvb's suggestion:

In [98]: a.dot(np.diag(e)).dot(a.T).dot(f)
Out[98]: array([ 556, 1132, 1708])

Here's the einsum version:

In [99]: np.einsum('ij,j,jk,k', a, e, a.T, f)
Out[99]: array([ 556, 1132, 1708])

You can eliminate the need to transpose a by swapping the index labels associated with that argument:

In [100]: np.einsum('ij,j,kj,k', a, e, a, f)
Out[100]: array([ 556, 1132, 1708])

Upvotes: 1

Related Questions