Reputation: 116
I have the following for loop that operates over three numpy arrays of the same length:
import numpy as np

n = 100
a = np.random.random(n)
b = np.random.random(n)
c = np.random.random(n)
valid = np.empty(n)
for i in range(n):
    valid[i] = np.any(a[i] > b[i:] + c[i:].cumsum())
Is there a way to replace this for loop with some vectorized numpy operations?
For example, if I only cared whether a[i] is larger than any value in b[i:], I could use np.minimum.accumulate(b[::-1])[::-1], which gives the smallest value of b from each index onwards, and compare it to a like this:

a > np.minimum.accumulate(b[::-1])[::-1]

but I still need a way to vectorize the c[i:].cumsum() part into a single array calculation.
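For reference, a minimal check (illustration only, assuming the arrays from the snippet above are in scope) that the reversed minimum-accumulate really is the suffix minimum of b:

suffix_min = np.minimum.accumulate(b[::-1])[::-1]
# suffix_min[i] == b[i:].min() for every i
assert all(suffix_min[i] == b[i:].min() for i in range(n))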
Upvotes: 2
Views: 502
Reputation: 3066
I assume you want to vectorize it to decrease the running time. Since you are only using pure NumPy operations, you can use numba (see 5 Minutes Guide to Numba). It will look something like this:
import numba
import numpy as np

@numba.njit()
def valid_for_single_idx(idx, a, b, c):
    return np.any(a[idx] > b[idx:] + c[idx:].cumsum())

valid = np.empty(n)
for i in range(n):
    valid[i] = valid_for_single_idx(i, a, b, c)
So far this isn't really vectorization (the loop still happens), but it translates the NumPy line into LLVM-compiled code, so each iteration runs about as fast as possible.
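As a variation (a sketch of my own, not from the original answer), the whole loop can also be moved inside the compiled function, which removes the Python-level overhead of calling valid_for_single_idx once per index:

import numba
import numpy as np

@numba.njit()
def valid_all(a, b, c):
    # hypothetical helper: the question's loop, run entirely in compiled code
    n = a.shape[0]
    out = np.empty(n, dtype=np.bool_)
    for i in range(n):
        out[i] = np.any(a[i] > b[i:] + c[i:].cumsum())
    return out

valid = valid_all(a, b, c)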
Going back to the per-index version: although it doesn't increase the speed, it looks a bit nicer if you use map:
import numba
import numpy as np
from functools import partial

@numba.njit()
def valid_for_single_idx(idx, a, b, c):
    return np.any(a[idx] > b[idx:] + c[idx:].cumsum())

valid = map(partial(valid_for_single_idx, a=a, b=b, c=c), range(n))
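One caveat not spelled out above: map returns a lazy iterator, not an array. If you need valid as a NumPy array, one way (a sketch using np.fromiter) is:

valid = np.fromiter(map(partial(valid_for_single_idx, a=a, b=b, c=c), range(n)),
                    dtype=bool, count=n)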
Upvotes: 2
Reputation: 114440
Your goal is to find the minimum of b[i:] + c[i:].cumsum() for each i, since np.any(a[i] > x) is the same as a[i] > x.min(). Clearly you can compare that minimum to a directly.
You can write the elements of c[i:].cumsum() as the upper triangle of a matrix. Let's look at a toy case with n = 3:
c = [c1, c2, c3]
s1 = c.cumsum()
s0 = np.r_[0, s1[:-1]]
You can write the elements of the cumulative sum as
c1,  c1 + c2,  c1 + c2 + c3      s1[0:]                  s1[0:] - s0[0]
     c2,       c2 + c3        =  s1[1:] - c1          =  s1[1:] - s0[1]
               c3                s1[2:] - (c1 + c2)      s1[2:] - s0[2]
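A quick numeric check of this identity (illustration only, with ct standing in for the toy c above and s1, s0 defined as before):

ct = np.array([2.0, 5.0, 11.0])     # concrete values for c1, c2, c3
s1 = ct.cumsum()                    # [ 2.,  7., 18.]
s0 = np.r_[0, s1[:-1]]              # [ 0.,  2.,  7.]
for i in range(3):
    # row i of the triangle is ct[i:].cumsum(), which equals s1[i:] - s0[i]
    assert np.allclose(ct[i:].cumsum(), s1[i:] - s0[i])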
You can use np.triu_indices to construct these sums as a raveled array:

r, c = np.triu_indices(n)    # row and column index of every upper-triangle element
diff = s1[c] - s0[r] + b[c]  # note: c now refers to the column indices; s1 and s0 were computed from the original c beforehand
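To make the indexing concrete (illustration only), for the n = 3 toy case np.triu_indices gives:

r3, c3 = np.triu_indices(3)
# r3 = [0 0 0 1 1 2]  ->  which row of the triangle each element belongs to
# c3 = [0 1 2 1 2 2]  ->  which original index it refers to

so diff lists the rows of the triangle one after another.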
Since np.minimum is a ufunc, you can reduce diff over each run defined by r using np.minimum.reduceat. The start of each run is given roughly by np.flatnonzero(np.diff(r)) + 1, but you can generate the offsets faster with np.arange (for n = 4 they would be [0, 4, 7, 9]):

m = np.minimum.reduceat(diff, np.r_[0, np.arange(n, 1, -1).cumsum()])
So finally, you have:
valid = a > m
TL;DR
s1 = c.cumsum()
s0 = np.r_[0, s1[:-1]]
r, c = np.triu_indices(n)
valid = a > np.minimum.reduceat(s1[c] - s0[r] + b[c], np.r_[0, np.arange(n, 1, -1).cumsum()])
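As a quick correctness check (my addition, not part of the answer): the snippet above rebinds c to the column indices, so the original values are recovered as s1 - s0 before re-running the question's loop:

c_orig = s1 - s0                      # recover the original c, since s1 = c.cumsum() and s0 is its shifted version
expected = np.empty(n, dtype=bool)
for i in range(n):
    expected[i] = np.any(a[i] > b[i:] + c_orig[i:].cumsum())
assert np.array_equal(valid, expected)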
Upvotes: 2