Reputation: 2956
I'm trying to write a function that takes a matrix and sets the main-diagonal values (i.e. A[0,0], A[1,1], ..., A[N-1,N-1]) by summing the values in the column below each diagonal cell.
An example:
A = np.array([[0, 0, 0],
              [3, 0, 0],
              [4, 2, 0]])
B = function_to_be_built(A)
B = np.array([[7, 0, 0],
              [3, 2, 0],
              [4, 2, 0]])
where you can see that cell [0,0] = 3 + 4 = 7, and cell [1,1] = 2 (the only value below it in that column).
Of course this could be achieved relatively easily using a for loop, but the code is designed to sit in a central kernel, so efficiency is critical. I feel like there should be a way to achieve this efficiently using numpy... something like using np.tril to select only the values below the diagonal?
Could anyone help me get to the solution, please?
Thanks.
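For reference, here is the straightforward loop version being referred to, as a correctness baseline (the helper name diag_from_below_loop is just for illustration):

```python
import numpy as np

def diag_from_below_loop(A):
    """Naive baseline: set each diagonal entry to the sum of the
    values below it in the same column."""
    B = A.copy()
    for j in range(len(A)):
        B[j, j] = A[j + 1:, j].sum()
    return B

A = np.array([[0, 0, 0],
              [3, 0, 0],
              [4, 2, 0]])
print(diag_from_below_loop(A))
# [[7 0 0]
#  [3 2 0]
#  [4 2 0]]
```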
Upvotes: 1
Views: 89
Reputation: 53029
Here is a solution using np.add.at:
y, x = np.tril_indices(len(A), -1)
np.add.at(A, (x, x), A[y, x])
This adds the below-diagonal column sums in place to the corresponding diagonal elements. If the diagonal entries are not guaranteed to be zero, clear them first with
A[np.arange(len(A)), np.arange(len(A))] = 0
Note that in either case this method does not require the upper triangle to be zero.
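Putting this together on the question's example (written here with a tuple index, the unambiguous form for np.add.at; np.add.at accumulates correctly even though each diagonal index appears multiple times):

```python
import numpy as np

A = np.array([[0, 0, 0],
              [3, 0, 0],
              [4, 2, 0]])

# row/column indices of the strictly lower triangle
y, x = np.tril_indices(len(A), -1)
# add each below-diagonal value onto its column's diagonal cell, in place
np.add.at(A, (x, x), A[y, x])
print(A)
# [[7 0 0]
#  [3 2 0]
#  [4 2 0]]
```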
If this operation is repeated a lot on matrices of the same size, then indices can be precomputed and flattened for increased performance.
# compute once
flatinds = np.ravel_multi_index((y, x), A.shape)  # below-diagonal cells
flatdiag = np.ravel_multi_index((x, x), A.shape)  # target diagonal cells (repeated per summand)
flshdiag = (len(A)+1) * np.arange(len(A))         # diagonal cells (once each)
# use every iteration
Afl = A.ravel()
Afl[flshdiag] = 0  # only if the diagonal may be nonzero
np.add.at(Afl, flatdiag, Afl[flatinds])
Couldn't resist going for a really fast one. It leverages linear indexing and np.add.reduceat. Since most sums run along relatively long stretches of a column, it pays off to do a full (non-lazy) transpose at the beginning and end of the computation. For N >= 5 it consistently beats all the other contenders. If you can keep the matrices involved in column-major (Fortran) order, the two transposes can even be saved, in which case it is faster than all the others from N == 3 onward.
# precompute this (A.shape == (N, N))
finds = np.c_[N * np.arange(N), 1 + (N+1) * np.arange(N)].ravel()[1:-2].copy()

def PPt(A):
    A = A.T.copy()  # column sums become contiguous row stretches
    Af = A.ravel()
    # every other reduceat segment is a below-diagonal run; write the
    # segment sums straight onto the (transposed) diagonal
    Af[:-1:N+1] = np.add.reduceat(Af[:-N], finds)[::2]
    Af[-1] = 0      # last column has nothing below the diagonal
    return A.T.copy()
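Run on the question's example (with finds assembled for N == 3 exactly as above), PPt produces the expected result:

```python
import numpy as np

N = 3
A = np.array([[0, 0, 0],
              [3, 0, 0],
              [4, 2, 0]])

# segment boundaries for np.add.reduceat; for N == 3 this is [1, 3, 5]
finds = np.c_[N * np.arange(N), 1 + (N + 1) * np.arange(N)].ravel()[1:-2].copy()

def PPt(A):
    A = A.T.copy()  # make column sums contiguous
    Af = A.ravel()
    Af[:-1:N + 1] = np.add.reduceat(Af[:-N], finds)[::2]
    Af[-1] = 0
    return A.T.copy()

print(PPt(A))
# [[7 0 0]
#  [3 2 0]
#  [4 2 0]]
```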
Upvotes: 2
Reputation: 18488
If you are sure that the diagonal and the upper triangle are filled with zeros, you can do:
In [43]: B = A + np.diag(A.sum(axis=0))
In [44]: B
Out[44]:
array([[7, 0, 0],
[3, 2, 0],
[4, 2, 0]])
If you cannot guarantee that the upper triangle is zero, you can restrict the sum to the strictly lower triangle:
In [62]: B = A + np.diag(np.tril(A, -1).sum(axis=0))
In [63]: B
Out[63]:
array([[7, 0, 0],
[3, 2, 0],
[4, 2, 0]])
Upvotes: 1
Reputation: 57033
You can take the lower triangular part of A, sum it column-wise, convert the sums into a diagonal matrix, and add the result back to A:
A + np.eye(A.shape[0]) * np.tril(A).sum(axis=0)
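One small caveat worth noting: np.eye returns a float array, so the result is upcast to float64 even when A is integer (np.diag of the sums, as in the other answer, keeps the integer dtype). A quick check on the question's example:

```python
import numpy as np

A = np.array([[0, 0, 0],
              [3, 0, 0],
              [4, 2, 0]])

# np.tril(A) keeps the diagonal, so this assumes the diagonal is zero;
# multiplying by np.eye upcasts the result to float64
B = A + np.eye(A.shape[0]) * np.tril(A).sum(axis=0)
print(B)
# [[7. 0. 0.]
#  [3. 2. 0.]
#  [4. 2. 0.]]
```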
Upvotes: 2