Is there a function to calculate the sum of numpy ndarray elements returning a 1d array result?

Question

Simple example:

import numpy as np

a = np.array([[[1, 0, 0],
               [0, 2, 0],
               [0, 0, 3]],

              [[1, 0, 0],
               [0, 1, 0],
               [0, 0, 1]]])

result = []

for i in a:
    result.append(i.sum())

# result == [6, 3]

Is there a numpy function doing this faster? If it helps: a contains only diagonal matrices.

Edit: I just realized that a contains scipy csc_sparse matrices, i.e. it's a numpy 1D array containing matrices, and I cannot apply the sum function with axis=(1, 2).

norok2 · Accepted Answer

A proper use of the axis parameter of np.sum() would do:

import numpy as np


np.sum(a, axis=(1, 2))
# [6, 3]
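
As a quick sanity check, the vectorized call can be compared against the explicit loop from the question. This is only a minimal sketch: the names loop_result, vectorized and flattened are illustrative, and the reshape-based variant is an equivalent alternative that is not part of the answer above.

```python
import numpy as np

a = np.array([[[1, 0, 0],
               [0, 2, 0],
               [0, 0, 3]],
              [[1, 0, 0],
               [0, 1, 0],
               [0, 0, 1]]])

# Explicit loop over the first axis, as in the question.
loop_result = np.array([x.sum() for x in a])

# Sum over the last two axes in a single call.
vectorized = np.sum(a, axis=(1, 2))

# Equivalent alternative: flatten each matrix, then sum along axis 1.
flattened = a.reshape(len(a), -1).sum(axis=1)

print(loop_result, vectorized, flattened)
```

All three results should agree, since each one adds up every element of each 3x3 matrix.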

While the above should be the generally preferred method, if your input is actually diagonal over axes 1 and 2, then summing all the zeros is bound to be inefficient (O(n² k), with the same n and k as in the gen_a() function below). Using np.sum() after np.diag() inside a loop can be much better (O(n k), with the same n and k as before). A list comprehension is probably the way to go:

import numpy as np


np.array([np.sum(np.diag(x)) for x in a])
# [6, 3]
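
If the matrices live in a regular 3D array and are known to be diagonal, the Python-level loop can probably be avoided as well by reading only the diagonals in a single call. The sketch below uses np.trace() and np.einsum() for this. Neither is benchmarked in this answer, and both give the full sum only under the assumption that all off-diagonal entries are zero.

```python
import numpy as np

# Two diagonal 3x3 matrices stacked into an array of shape (2, 3, 3).
a = np.array([np.diag([1, 2, 3]), np.diag([1, 1, 1])])

# Trace of each matrix taken over axes 1 and 2; only diagonal entries are read.
by_trace = np.trace(a, axis1=1, axis2=2)

# Same computation written as an Einstein summation over the repeated index i.
by_einsum = np.einsum('kii->k', a)

print(by_trace, by_einsum)
```

Both calls touch n entries per matrix rather than n², so they would be expected to scale like the np.diag() loop while avoiding the per-matrix Python overhead.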

To give some idea of the relative speed, let's write a function to generate inputs of arbitrary size:

def gen_a(n, k):
    return np.array([
        np.diag(np.ones(n, dtype=int))
        if i % 2 else
        np.diag(np.arange(1, n + 1, dtype=int))
        for i in range(k)])


print(gen_a(3, 2))
# [[[1 0 0]
#   [0 2 0]
#   [0 0 3]]

#  [[1 0 0]
#   [0 1 0]
#   [0 0 1]]]

Now, we can time the approaches for different input sizes. I have also included a list comprehension without the np.diag() call, which is essentially a slightly more concise version of your approach.

a = gen_a(3, 2)
%timeit np.array([np.sum(np.diag(x)) for x in a])
# 100000 loops, best of 3: 16 µs per loop
%timeit np.sum(a, axis=(1, 2))
# 100000 loops, best of 3: 4.51 µs per loop
%timeit np.array([np.sum(x) for x in a])
# 100000 loops, best of 3: 10 µs per loop

a = gen_a(3000, 2)
%timeit np.array([np.sum(np.diag(x)) for x in a])
# 10000 loops, best of 3: 20.5 µs per loop
%timeit np.sum(a, axis=(1, 2))
# 100 loops, best of 3: 17.8 ms per loop
%timeit np.array([np.sum(x) for x in a])
# 100 loops, best of 3: 17.8 ms per loop

a = gen_a(3, 2000)
%timeit np.array([np.sum(np.diag(x)) for x in a])
# 100 loops, best of 3: 14.8 ms per loop
%timeit np.sum(a, axis=(1, 2))
# 10000 loops, best of 3: 34 µs per loop
%timeit np.array([np.sum(x) for x in a])
# 100 loops, best of 3: 8.93 ms per loop

a = gen_a(300, 200)
%timeit np.array([np.sum(np.diag(x)) for x in a])
# 1000 loops, best of 3: 1.67 ms per loop
%timeit np.sum(a, axis=(1, 2))
# 100 loops, best of 3: 17.8 ms per loop
%timeit np.array([np.sum(x) for x in a])
# 100 loops, best of 3: 19.3 ms per loop

We observe that, depending on the values of n and k, one or the other solution is faster. For larger n, the list comprehension is faster, but only if np.diag() is used. Conversely, for smaller n and larger k, the raw speed of np.sum() can outperform the explicit looping.
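
Regarding the edit in the question: when a is a 1D object array holding scipy sparse matrices, axis=(1, 2) is not applicable, because NumPy only sees one axis of Python objects. A per-element loop is then probably unavoidable, but each sparse matrix can sum itself. The following is a minimal sketch under the assumption that the elements are scipy.sparse.csc_matrix instances; the input construction is hypothetical and only serves to mimic the structure described in the question.

```python
import numpy as np
from scipy.sparse import csc_matrix

# Hypothetical input: a 1D object array whose elements are sparse matrices.
a = np.empty(2, dtype=object)
a[0] = csc_matrix(np.diag([1, 2, 3]))
a[1] = csc_matrix(np.diag([1, 1, 1]))

# Sum of all entries of each sparse matrix; only stored values are visited.
totals = np.array([m.sum() for m in a])

# If the matrices are known to be diagonal, summing the main diagonal is equivalent.
diag_totals = np.array([m.diagonal().sum() for m in a])

print(totals, diag_totals)
```

Since a sparse matrix stores only its non-zero entries, m.sum() does not pay for the zeros the way the dense np.sum() does.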
