Fabio
Fabio

Reputation: 59

Function over different arrays and indexes

I want to perform an operation over two arrays x = [1, 2, 3, ...] and y = [4, NaN, NaN, ...]. The goal is to compute the remaining values for array y using scalars and the previous index, for example, y[1] = (y[0] * 7 + x[1]) / 8, y[2] = (y[1] * 7 + x[2]) / 8, ... I have tried things along the lines of y[1:] = (y[0:len(y) - 1] * 7 + x[1:]) / 8, but get completely wrong results. How can I create this kind of operations in numpy? A for loop would be too inefficient?

Upvotes: 0

Views: 98

Answers (1)

mathfux
mathfux

Reputation: 5939

You are attempting to do like this:

for i in range(N):
    y[i+1] = (y[i] * 7 + x[i+1]) / 8 

So I'm pretty sure it's worth calling it 'looping in Python level'. From Introduction of numpy vectorized operations:

NumPy provides highly-optimized functions for performing mathematical operations on arrays of numbers. Performing extensive iterations (e.g. via ‘for-loops’) in Python to perform repeated mathematical computations should nearly always be replaced by the use of vectorized functions on arrays. This informs the entire design paradigm of NumPy.

There is no straightforward way of vectorization in your case. At least, you can try mathematical approach since this kind of relation is known widely

A relation of the form A * y[i+1] + B * y[i] == f(i) is said to be linear first-order non-homogeneous difference recurrence equation

In order to move further, you can rearrange this into:

y[n+1] = (7/8) * y[n] + x[n+1]/8

Based on this formula, a remaining part of solution looks like:

y[1] = (7/8) * y[0] + x[1]/8
y[2] = 7/8 * [(7/8) * y[0] + x[1]/8] + x[2]/8 = (7/8)**2*y[0] + 7/8*x[1]/8 + x[2]/8
...

So you can conclude that

y[n] = (7/8)**n*y[0] + x[n]/8 + (7/8)*x[n-1]/8 + (7/8)**2*x[n-2]/8 + ... + (7/8)**(n-1)*x[1]

Hence finding previous item y[i-1] is not required in order to find y[i] anymore. Now let's find [y[1], y[2], ... y[n]] in one go according to the formula that we've got in terms of general numpy broadcasting rules.

You need to add Y =

y[0]*(7/8)
y[0]*(7/8)**2
y[0]*(7/8)**3
       ...
y[0]*(7/8)**(n

with a result of product V =

1   0        0        0  ...     0
1 7/8        0        0  ...     0
1 7/8 (7/8)**2        0  ...     0
1 7/8 (7/8)**2 (7/8)**3  ...     0
.  .      .        .             .
·  ·      ·        ·             ·
'  '      '        '             '
1 7/8 (7/8)**2 (7/8)**3  ... (7/8)**(n-1)

and X =

 1  0  0  0  0  0  0  0  0  0 ...
 2  1  0  0  0  0  0  0  0  0 ...
 3  2  1  0  0  0  0  0  0  0 ...
 4  3  2  1  0  0  0  0  0  0 ...
 5  4  3  2  1  0  0  0  0  0 ...
 6  5  4  3  2  1  0  0  0  0 ...
...

I'll provide a numpy solution for this, not the best looking but you could refactor it:

n = 10
y = np.empty(n+1)
y[0] = 4
R = np.repeat([1, 7/8], [1, n])
Y = np.cumprod(R[1:])
V = np.tril(np.vander(R, increasing=True))[:-1,:-1]
X = np.lib.stride_tricks.sliding_window_view(np.r_[np.zeros(n-1), np.arange(1, n+1)], n)[:, ::-1]
T1, T2 = (Y*y[0], np.sum(V*X/8, axis=1))
y[1:] = T1 + T2
>>> y
[4.         3.625      3.421875   3.36914062 3.44799805 3.64199829
 3.9367485  4.31965494 4.77969807 5.30723581 5.89383134]

In general it's O(n²) in comparison with initial method which is O(n). If you know formulas of terms of np.sum(V*X, axis=1), this could reach O(n) and outperform the first way.

Upvotes: 1

Related Questions